-
Constructing efficient spatial discretizations of spans of multivariate Chebyshev polynomials
Authors:
Lutz Kämmerer
Abstract:
For an arbitrary given span of high-dimensional multivariate Chebyshev polynomials, an approach to construct spatial discretizations is presented, i.e., the construction of a sampling set that allows for the unique reconstruction of each polynomial of this span.
The approach presented here combines three different types of efficiency. First, the construction of the spatial discretization should be efficient with respect to the dimension of the span of the Chebyshev polynomials. Second, the number of sampling nodes within the constructed discretizations should be efficient, i.e., the oversampling factors should be reasonable. Third, there should be an efficient method for the unique reconstruction of a polynomial from given sampling values at the sampling nodes of the discretization.
The first two mentioned types of efficiency are also present in constructions based on random sampling nodes, but the lack of structure here causes the inefficiency of the reconstruction method. Our approach uses a combination of cosine transformed rank-1 lattices whose structure allows for applications of univariate fast Fourier transforms for the reconstruction algorithm and is thus a priori efficiently realizable.
Besides the theoretical estimates of numbers of sampling nodes and failure probabilities due to a random draw of the used lattices, we present several improvements of the basic design approach that significantly increase its practical applicability. Numerical tests, which discretize spans of multivariate Chebyshev polynomials depending on up to more than 50 spatial variables, corroborate the theoretical results and the significance of the improvements.
Submitted 5 June, 2024;
originally announced June 2024.
-
The Inefficiency of Genetic Programming for Symbolic Regression -- Extended Version
Authors:
Gabriel Kronberger,
Fabricio Olivetti de Franca,
Harry Desmond,
Deaglan J. Bartlett,
Lukas Kammerer
Abstract:
We analyse the search behaviour of genetic programming for symbolic regression in practically relevant but limited settings, allowing exhaustive enumeration of all solutions. This enables us to quantify the success probability of finding the best possible expressions, and to compare the search efficiency of genetic programming to random search in the space of semantically unique expressions. This analysis is made possible by improved algorithms for equality saturation, which we use to improve the Exhaustive Symbolic Regression algorithm; this produces the set of semantically unique expression structures, orders of magnitude smaller than the full symbolic regression search space. We compare the efficiency of random search in the set of unique expressions and genetic programming. For our experiments we use two real-world datasets where symbolic regression has been used to produce well-fitting univariate expressions: the Nikuradse dataset of flow in rough pipes and the Radial Acceleration Relation of galaxy dynamics. The results show that genetic programming in such limited settings explores only a small fraction of all unique expressions, and evaluates expressions repeatedly that are congruent to already visited expressions.
Submitted 26 April, 2024;
originally announced April 2024.
-
Femtosecond spin-state switching dynamics of spin-crossover molecules condensed in thin films
Authors:
Lea Kämmerer,
Gérald Kämmerer,
Manuel Gruber,
Jan Grunwald,
Tobias Lojewski,
Laurent Mercadier,
Loïc Le Guyader,
Robert Carley,
Cammille Carinan,
Natalia Gerasimova,
David Hickin,
Benjamin E. Van Kuiken,
Giuseppe Mercurio,
Martin Teichmann,
Senthil Kumar Kuppusamy,
Andreas Scherz,
Mario Ruben,
Klaus Sokolowski-Tinten,
Andrea Eschenlohr,
Katharina Ollefs,
Carolin Schmitz-Antoniak,
Felix Tuczek,
Peter Kratzer,
Uwe Bovensiepen,
Heiko Wende
Abstract:
The photoinduced switching of Fe(II)-based spin-crossover complexes from singlet to quintet takes place at ultrafast time scales. This a priori spin-forbidden transition triggered numerous time-resolved experiments of solvated samples to elucidate the mechanism at play. The involved intermediate states remain uncertain. We apply ultrafast x-ray spectroscopy in molecular films as a method sensitive to spin, electronic, and nuclear degrees of freedom. Combining the progress in molecule synthesis and film growth with the opportunities at x-ray free-electron lasers, we analyze the transient evolution of the Fe L3 fine structure at room temperature. Our measurements and calculations indicate the involvement of an Fe triplet intermediate state. The high-spin state saturates at half of the available molecules, limited by molecule-molecule interaction within the film.
Submitted 3 December, 2023;
originally announced December 2023.
-
A precise symbolic emulator of the linear matter power spectrum
Authors:
Deaglan J. Bartlett,
Lukas Kammerer,
Gabriel Kronberger,
Harry Desmond,
Pedro G. Ferreira,
Benjamin D. Wandelt,
Bogdan Burlacu,
David Alonso,
Matteo Zennaro
Abstract:
Computing the matter power spectrum, $P(k)$, as a function of cosmological parameters can be prohibitively slow in cosmological analyses, hence emulating this calculation is desirable. Previous analytic approximations are insufficiently accurate for modern applications, so black-box, uninterpretable emulators are often used. We utilise an efficient genetic programming based symbolic regression framework to explore the space of potential mathematical expressions which can approximate the power spectrum and $σ_8$. We learn the ratio between an existing low-accuracy fitting function for $P(k)$ and that obtained by solving the Boltzmann equations and thus still incorporate the physics which motivated this earlier approximation. We obtain an analytic approximation to the linear power spectrum with a root mean squared fractional error of 0.2% between $k = 9\times10^{-3} - 9 \, h{\rm \, Mpc^{-1}}$ and across a wide range of cosmological parameters, and we provide physical interpretations for various terms in the expression. Our analytic approximation is 950 times faster to evaluate than camb and 36 times faster than the neural network based matter power spectrum emulator BACCO. We also provide a simple analytic approximation for $σ_8$ with a similar accuracy, with a root mean squared fractional error of just 0.1% when evaluated across the same range of cosmologies. This function is easily invertible to obtain $A_{\rm s}$ as a function of $σ_8$ and the other cosmological parameters, if preferred. It is possible to obtain symbolic approximations to a seemingly complex function at a precision required for current and future cosmological analyses without resorting to deep-learning techniques, thus avoiding their black-box nature and large number of parameters. Our emulator will be usable long after the codes on which numerical approximations are built become outdated.
Submitted 15 April, 2024; v1 submitted 27 November, 2023;
originally announced November 2023.
-
Photo-induced charge-transfer renormalization in NiO
Authors:
Tobias Lojewski,
Denis Golez,
Katharina Ollefs,
Loïc Le Guyader,
Lea Kämmerer,
Nico Rothenbach,
Robin Y. Engel,
Piter S. Miedema,
Martin Beye,
Gheorghe S. Chiuzbăian,
Robert Carley,
Rafael Gort,
Benjamin E. Van Kuiken,
Giuseppe Mercurio,
Justina Schlappa,
Alexander Yaroslavtsev,
Andreas Scherz,
Florian Döring,
Christian David,
Heiko Wende,
Uwe Bovensiepen,
Martin Eckstein,
Philipp Werner,
Andrea Eschenlohr
Abstract:
Photo-doped states in strongly correlated charge transfer insulators are characterized by $d$-$d$ and $d$-$p$ interactions and the resulting intertwined dynamics of charge excitations and local multiplets. Here we use femtosecond x-ray absorption spectroscopy in combination with dynamical mean-field theory to disentangle these contributions in NiO. Upon resonant optical excitation across the charge transfer gap, the Ni $L_3$ and O $K$ absorption edges red-shift for $>10$ ps, associated with photo-induced changes in the screening environment. An additional signature below the Ni $L_3$ edge is identified for $<1$ ps, reflecting a transient nonthermal population of local many-body multiplets. We employ a nonthermal generalization of the multiplet ligand field theory to show that the feature originates from $d$-$d$ transitions. Overall, the photo-doped state differs significantly from a chemically doped state. Our results demonstrate the ability to reveal excitation pathways in correlated materials by x-ray spectroscopies, which is relevant for ultrafast materials design.
Submitted 24 May, 2024; v1 submitted 17 May, 2023;
originally announced May 2023.
-
Nonlinear approximation in bounded orthonormal product bases
Authors:
Lutz Kämmerer,
Daniel Potts,
Fabian Taubert
Abstract:
We present a dimension-incremental algorithm for the nonlinear approximation of high-dimensional functions in an arbitrary bounded orthonormal product basis. Our goal is to detect a suitable truncation of the basis expansion of the function, where the corresponding basis support is assumed to be unknown. Our method is based on point evaluations of the considered function and adaptively builds an index set of a suitable basis support such that the approximately largest basis coefficients are still included. For this purpose, the algorithm only needs a suitable search space that contains the desired index set. Throughout the work, there are various minor modifications of the algorithm discussed as well, which may yield additional benefits in several situations. For the first time, we provide a proof of a detection guarantee for such an index set in the function approximation case under certain assumptions on the sub-methods used within our algorithm, which can be used as a foundation for similar statements in various other situations as well. Some numerical examples in different settings underline the effectiveness and accuracy of our method.
Submitted 27 April, 2023; v1 submitted 11 November, 2022;
originally announced November 2022.
-
Symbolic Regression with Fast Function Extraction and Nonlinear Least Squares Optimization
Authors:
Lukas Kammerer,
Gabriel Kronberger,
Michael Kommenda
Abstract:
Fast Function Extraction (FFX) is a deterministic algorithm for solving symbolic regression problems. We improve the accuracy of FFX by adding parameters to the arguments of nonlinear functions. Instead of only optimizing linear parameters, we optimize these additional nonlinear parameters with separable nonlinear least squares optimization using a variable projection algorithm. Both FFX and our new algorithm are applied to the PennML benchmark suite. We show that the proposed extensions of FFX lead to higher accuracy while providing models of similar length and with only a small increase in runtime on the given data. Our results are compared to a large set of regression methods that were already published for the given benchmark suite.
Submitted 20 September, 2022;
originally announced September 2022.
-
On the reconstruction of functions from values at subsampled quadrature points
Authors:
Felix Bartel,
Lutz Kämmerer,
Daniel Potts,
Tino Ullrich
Abstract:
This paper is concerned with function reconstruction from samples. The sampling points used in several approaches are (1) structured points connected with fast algorithms or (2) unstructured points coming from, e.g., an initial random draw to achieve an improved information complexity. We connect both approaches and propose a subsampling of structured points in an offline step. In particular, we start with structured quadrature points (QMC), which provide stable $L_2$ reconstruction properties. The subsampling procedure consists of a computationally inexpensive random step followed by a deterministic procedure to further reduce the number of points while keeping its information. Functions (belonging to a RKHS of bounded functions) are sampled at these points and reconstructed from the samples while achieving state-of-the-art error decay. Our method is dimension-independent and is applicable as soon as we know some initial quadrature points. We apply our general findings on the $d$-dimensional torus to subsample rank-1 lattices, where it is known that full rank-1 lattices lose half the optimal order of convergence (expressed in terms of the size of the lattice). In contrast to that, our subsampled version regains the optimal rate since many of the lattice points are not needed. Moreover, we utilize fast and memory efficient Fourier algorithms in order to compute the approximation. Numerical experiments in several dimensions support our findings.
Submitted 5 June, 2023; v1 submitted 29 August, 2022;
originally announced August 2022.
-
Cluster Analysis of a Symbolic Regression Search Space
Authors:
Gabriel Kronberger,
Lukas Kammerer,
Bogdan Burlacu,
Stephan M. Winkler,
Michael Kommenda,
Michael Affenzeller
Abstract:
In this chapter we take a closer look at the distribution of symbolic regression models generated by genetic programming in the search space. The motivation for this work is to improve the search for well-fitting symbolic regression models by using information about the similarity of models that can be precomputed independently from the target function. For our analysis, we use a restricted grammar for uni-variate symbolic regression models and generate all possible models up to a fixed length limit. We identify unique models and cluster them based on phenotypic as well as genotypic similarity. We find that phenotypic similarity leads to well-defined clusters while genotypic similarity does not produce a clear clustering. By mapping solution candidates visited by GP to the enumerated search space we find that GP initially explores the whole search space and later converges to the subspace of highest quality expressions in a run for a simple benchmark problem.
Submitted 28 September, 2021;
originally announced September 2021.
-
Symbolic Regression by Exhaustive Search: Reducing the Search Space Using Syntactical Constraints and Efficient Semantic Structure Deduplication
Authors:
Lukas Kammerer,
Gabriel Kronberger,
Bogdan Burlacu,
Stephan M. Winkler,
Michael Kommenda,
Michael Affenzeller
Abstract:
Symbolic regression is a powerful system identification technique in industrial scenarios where no prior knowledge on model structure is available. Such scenarios often require specific model properties such as interpretability, robustness, trustworthiness and plausibility, that are not easily achievable using standard approaches like genetic programming for symbolic regression. In this chapter we introduce a deterministic symbolic regression algorithm specifically designed to address these issues. The algorithm uses a context-free grammar to produce models that are parameterized by a non-linear least squares local optimization procedure. A finite enumeration of all possible models is guaranteed by structural restrictions as well as a caching mechanism for detecting semantically equivalent solutions. Enumeration order is established via heuristics designed to improve search efficiency. Empirical tests on a comprehensive benchmark suite show that our approach is competitive with genetic programming in many noiseless problems while maintaining desirable properties such as simple, reliable models and reproducibility.
Submitted 28 September, 2021;
originally announced September 2021.
-
The uniform sparse FFT with application to PDEs with random coefficients
Authors:
Lutz Kämmerer,
Daniel Potts,
Fabian Taubert
Abstract:
We develop the uniform sparse Fast Fourier Transform (usFFT), an efficient, non-intrusive, adaptive algorithm for the solution of elliptic partial differential equations with random coefficients. The algorithm is an adaption of the sparse Fast Fourier Transform (sFFT), a dimension-incremental algorithm, which tries to detect the most important frequencies in a given search domain and therefore adaptively generates a suitable Fourier basis corresponding to the approximately largest Fourier coefficients of the function. The usFFT does this w.r.t. the stochastic domain of the PDE simultaneously for multiple fixed spatial nodes, e.g., nodes of a finite element mesh. The key idea of joining the detected frequency sets in each dimension increment results in a Fourier approximation space, which fits uniformly for all these spatial nodes. This strategy allows for a faster and more efficient computation, because it needs significantly fewer samples than applying other algorithms, e.g., the sFFT, to each spatial node separately. We test the usFFT for different examples using periodic, affine and lognormal random coefficients in the PDE problems.
Submitted 2 September, 2022; v1 submitted 9 September, 2021;
originally announced September 2021.
-
Data Aggregation for Reducing Training Data in Symbolic Regression
Authors:
Lukas Kammerer,
Gabriel Kronberger,
Michael Kommenda
Abstract:
The growing volume of data makes the use of computationally intense machine learning techniques such as symbolic regression with genetic programming more and more impractical. This work discusses methods to reduce the training data and thereby also the runtime of genetic programming. The data is aggregated in a preprocessing step before running the actual machine learning algorithm. K-means clustering and data binning are used for data aggregation and compared with random sampling as the simplest data reduction method. We analyze the achieved speed-up in training and the effects on the trained models' test accuracy for every method on four real-world data sets. The performance of genetic programming is compared with random forests and linear regression. It is shown that k-means and random sampling lead to very small loss in test accuracy when the data is reduced down to only 30% of the original data, while the speed-up is proportional to the size of the data set. Binning, on the contrary, leads to models with very high test error.
Submitted 24 August, 2021;
originally announced August 2021.
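The aggregation step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `aggregate_kmeans`, the evenly strided initialization, and the choice to cluster features and target jointly (each row is a tuple with the target as its last entry) are assumptions made only for this illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equally long tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean_point(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(column) / n for column in zip(*points))


def aggregate_kmeans(rows, k, iterations=50):
    """Reduce `rows` to k centroids with plain Lloyd iterations.

    Assumption: rows hold features and target together, so each
    centroid acts as one aggregated training sample.
    """
    # Deterministic initialization (an assumption): k evenly strided rows.
    centers = [tuple(rows[i * len(rows) // k]) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for row in rows:
            nearest = min(range(k), key=lambda c: squared_distance(row, centers[c]))
            clusters[nearest].append(row)
        # An empty cluster keeps its previous center.
        new_centers = [
            mean_point(cluster) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers
```

Under these assumptions, a learner would then be trained on the k returned rows instead of the full data set; random sampling, the baseline named in the abstract, would simply pick k original rows instead.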
-
Hash-Based Tree Similarity and Simplification in Genetic Programming for Symbolic Regression
Authors:
Bogdan Burlacu,
Lukas Kammerer,
Michael Affenzeller,
Gabriel Kronberger
Abstract:
We introduce in this paper a runtime-efficient tree hashing algorithm for the identification of isomorphic subtrees, with two important applications in genetic programming for symbolic regression: fast, online calculation of population diversity and algebraic simplification of symbolic expression trees. Based on this hashing approach, we propose a simple diversity-preservation mechanism with promising results on a collection of symbolic regression benchmark problems.
Submitted 22 July, 2021;
originally announced July 2021.
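The idea of hashing expression trees so that isomorphic subtrees receive equal hashes, as outlined in the abstract above, can be sketched as follows. This is a generic illustration, not the algorithm of the paper: the tuple-based tree encoding, the `COMMUTATIVE` set, and the helper `distinct_fraction` are hypothetical choices for this example.

```python
import hashlib

# Assumption: only these operators are treated as commutative.
COMMUTATIVE = {"+", "*"}


def tree_hash(node):
    """Hash an expression tree given as a leaf string or a tuple (op, child, ...).

    Child hashes of commutative operators are sorted, so trees that differ
    only in argument order receive the same hash.
    """
    if isinstance(node, str):
        payload = "leaf:" + node
    else:
        op, *children = node
        child_hashes = [tree_hash(child) for child in children]
        if op in COMMUTATIVE:
            child_hashes.sort()
        payload = "op:" + op + "(" + ",".join(child_hashes) + ")"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def distinct_fraction(population):
    """Share of structurally distinct trees, a simple diversity measure."""
    return len({tree_hash(tree) for tree in population}) / len(population)
```

For example, `("+", "x", ("*", "y", "2"))` and `("+", ("*", "2", "y"), "x")` hash identically, whereas `("-", "x", "y")` and `("-", "y", "x")` do not, because subtraction is not in the commutative set.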
-
Identification of Dynamical Systems using Symbolic Regression
Authors:
Gabriel Kronberger,
Lukas Kammerer,
Michael Kommenda
Abstract:
We describe a method for the identification of models for dynamical systems from observational data. The method is based on the concept of symbolic regression and uses genetic programming to evolve a system of ordinary differential equations (ODE). The novelty is that we add a step of gradient-based optimization of the ODE parameters. For this we calculate the sensitivities of the solution to the initial value problem (IVP) using automatic differentiation. The proposed approach is tested on a set of 19 problem instances taken from the literature which includes datasets from simulated systems as well as datasets captured from mechanical systems. We find that gradient-based optimization of parameters improves predictive accuracy of the models. The best results are obtained when we first fit the individual equations to the numeric differences and then subsequently fine-tune the identified parameter values by fitting the IVP solution to the observed variable values.
Submitted 6 July, 2021;
originally announced July 2021.
-
A fast probabilistic component-by-component construction of exactly integrating rank-1 lattices and applications
Authors:
Lutz Kämmerer
Abstract:
Several more and more efficient component-by-component (CBC) constructions for suitable rank-1 lattices were developed during the last decades. On the one hand, there exist constructions that are based on minimizing some error functional. On the other hand, there is the possibility to construct rank-1 lattices whose corresponding cubature rule exactly integrates all elements within a space of multivariate trigonometric polynomials.
In this paper, we focus on the second approach, i.e., the exactness of rank-1 lattice rules. The main contribution is the analysis of a probabilistic version of an already known algorithm that realizes a CBC construction of such rank-1 lattices. It turns out that the computational effort of the known deterministic algorithm can be considerably improved on average by means of a simple randomization. Moreover, we give a detailed analysis of the computational costs with respect to a certain failure probability, which then leads to the development of a probabilistic CBC algorithm. In particular, the presented approach will be highly beneficial for the construction of so-called reconstructing rank-1 lattices, that are practically relevant for function approximation. Subsequent to the rigorous analysis of the presented CBC algorithms, we present an algorithm that determines reconstructing rank-1 lattices of reasonable lattice sizes with high probability. We provide estimates on the resulting lattice sizes and bounds on the occurring failure probability. Furthermore, we discuss the computational complexity of the presented algorithm.
Various numerical tests illustrate the efficiency of the presented algorithms. Among others, we demonstrate how to exploit the efficiency of our algorithm even for the construction of exactly integrating rank-1 lattices, provided that a certain property of the treated space of trigonometric polynomials is known.
Submitted 28 December, 2020;
originally announced December 2020.
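To make the notion of a reconstructing rank-1 lattice from the abstract above concrete, the following sketch checks the standard defining property (the map k -> k·z mod M is injective on the frequency set) and recovers Fourier coefficients from samples along the lattice. It is an illustration under stated assumptions, not the CBC construction analyzed in the paper: all function names are hypothetical, the generating vector and lattice size are chosen by hand, and a naive DFT stands in for the univariate FFT that would be used in practice.

```python
import cmath


def residue(k, z, M):
    """Inner product k . z reduced modulo the lattice size M."""
    return sum(kj * zj for kj, zj in zip(k, z)) % M


def is_reconstructing(freqs, z, M):
    """True if k -> (k . z) mod M is injective on the frequency set."""
    return len({residue(k, z, M) for k in freqs}) == len(freqs)


def lattice_nodes(z, M):
    """Rank-1 lattice nodes x_j = (j * z / M) mod 1 for j = 0, ..., M-1."""
    return [tuple(((j * zj) % M) / M for zj in z) for j in range(M)]


def evaluate(coeffs, x):
    """Evaluate the trigonometric polynomial sum_k c_k exp(2 pi i k . x)."""
    return sum(
        c * cmath.exp(2j * cmath.pi * sum(kj * xj for kj, xj in zip(k, x)))
        for k, c in coeffs.items()
    )


def reconstruct(samples, freqs, z, M):
    """Recover the coefficients from lattice samples with one length-M DFT.

    Naive O(M^2) DFT for clarity; an FFT would replace it in practice.
    """
    dft = [
        sum(samples[j] * cmath.exp(-2j * cmath.pi * j * l / M) for j in range(M)) / M
        for l in range(M)
    ]
    return {k: dft[residue(k, z, M)] for k in freqs}
```

With the hand-picked frequency set {(0,0), (1,0), (0,1), (1,1), (-1,2)}, generating vector z = (1, 3) and M = 7, the residues are 0, 1, 3, 4, 5, so the lattice is reconstructing and `reconstruct` returns the original coefficients up to rounding; with z = (1, 1) the frequencies (1,0) and (0,1) collide and the check fails.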
-
Sparse Fourier Transforms on Rank-1 Lattices for the Rapid and Low-Memory Approximation of Functions of Many Variables
Authors:
Craig Gross,
Mark Iwen,
Lutz Kämmerer,
Toni Volkmer
Abstract:
We consider fast, provably accurate algorithms for approximating functions on the $d$-dimensional torus, $f: \mathbb{ T }^d \rightarrow \mathbb{C}$, that are sparse (or compressible) in the Fourier basis. In particular, suppose that the Fourier coefficients of $f$, $\{c_{\bf k} (f) \}_{{\bf k} \in \mathbb{Z}^d}$, are concentrated in a finite set $I \subset \mathbb{Z}^d$ so that $$\min_{Ω\subset I s.t. |Ω| =s } \left\| f - \sum_{{\bf k} \in Ω} c_{\bf k} (f) e^{ -2 πi {\bf k} \cdot \circ} \right\|_2 < ε\|f \|_2$$ holds for $s \ll |I|$ and $ε\in (0,1)$. We aim to identify a near-minimizing subset $Ω\subset I$ and accurately approximate the associated Fourier coefficients $\{ c_{\bf k} (f) \}_{{\bf k} \in Ω}$ as rapidly as possible. We present both deterministic as well as randomized algorithms using $O(s^2 d \log^c (|I|))$-time/memory and $O(s d \log^c (|I|))$-time/memory, respectively. Most crucially, all of the methods proposed herein achieve these runtimes while satisfying theoretical best $s$-term approximation guarantees which guarantee their numerical accuracy and robustness to noise for general functions.
These are achieved by modifying several one-dimensional Sparse Fourier Transform (SFT) methods to subsample a function along a reconstructing rank-1 lattice for the given frequency set $I$ to rapidly identify a near-minimizing subset $Ω\subset I$ without using anything about the lattice beyond its generating vector. This requires new fast and low-memory frequency identification techniques capable of rapidly recovering vector-valued frequencies in $\mathbb{Z}^d$ as opposed to simple integer frequencies in the univariate setting. Two different strategies are proposed and analyzed, each with different accuracy versus computational speed and memory tradeoffs.
Submitted 17 December, 2020;
originally announced December 2020.
-
A sample efficient sparse FFT for arbitrary frequency candidate sets in high dimensions
Authors:
Lutz Kämmerer,
Felix Krahmer,
Toni Volkmer
Abstract:
In this paper a sublinear time algorithm is presented for the reconstruction of functions that can be represented by just few out of a potentially large candidate set of Fourier basis functions in high spatial dimensions, a so-called high-dimensional sparse fast Fourier transform. In contrast to many other such algorithms, our method works for arbitrary candidate sets and does not make additional structural assumptions on the candidate set. Our transform significantly improves upon the other approaches available for such a general framework in terms of the scaling of the sample complexity. Our algorithm is based on sampling the function along multiple rank-1 lattices with random generators.
Combined with a dimension-incremental approach, our method yields a sparse Fourier transform whose computational complexity only grows mildly in the dimension and can hence be efficiently computed even in high dimensions. Our theoretical analysis establishes that any Fourier $s$-sparse function can be accurately reconstructed with high probability. This guarantee is complemented by several numerical tests demonstrating the high efficiency and versatile applicability for the exactly sparse case and also for the compressible case.
Submitted 23 June, 2020;
originally announced June 2020.
-
A Deterministic Algorithm for Constructing Multiple Rank-1 Lattices of Near-Optimal Size
Authors:
Craig Gross,
Mark A. Iwen,
Lutz Kämmerer,
Toni Volkmer
Abstract:
In this paper we present the first known deterministic algorithm for the construction of multiple rank-1 lattices for the approximation of periodic functions of many variables. The algorithm works by converting a potentially large reconstructing single rank-1 lattice for some $ d $-dimensional frequency set $ I \subset [N]^d $ into a collection of much smaller rank-1 lattices which allow for accurate and efficient reconstruction of trigonometric polynomials with coefficients in $ I $ (and, therefore, for the approximation of multivariate periodic functions). The total number of sampling points in the resulting multiple rank-1 lattices is theoretically shown to be less than $ \mathcal{O}\left( |I| \log^{ 2 }(N |I|) \right) $ with constants independent of $d$, and by performing one-dimensional fast Fourier transforms on samples of trigonometric polynomials with Fourier support in $ I $ at these points, we obtain exact reconstruction of all Fourier coefficients in fewer than $ \mathcal{O}\left(d\,|I|\log^4 (N|I|)\right) $ total operations.
Additionally, we present a second multiple rank-1 lattice construction algorithm which constructs lattices with even fewer sampling points at the cost of only being able to reconstruct exact trigonometric polynomials rather than having additional theoretical approximation. Both algorithms are tested numerically and surpass the theoretical bounds. Notably, we observe that the oversampling factors #samples$/|I|$ appear to grow only logarithmically in $ |I| $ for the first algorithm and appear near-optimally bounded by four in the second algorithm.
Submitted 21 March, 2020;
originally announced March 2020.
-
Worst-case recovery guarantees for least squares approximation using random samples
Authors:
Lutz Kämmerer,
Tino Ullrich,
Toni Volkmer
Abstract:
We construct a least squares approximation method for the recovery of complex-valued functions from a reproducing kernel Hilbert space on $D \subset \mathbb{R}^d$. The nodes are drawn at random for the whole class of functions and the error is measured in $L_2(D,\varrho_D)$. We prove worst-case recovery guarantees by explicitly controlling all the involved constants. This leads to new preasymptotic recovery bounds with high probability for the error of Hyperbolic Fourier Regression on multivariate data. In addition, we further investigate its counterpart Hyperbolic Wavelet Regression also based on least-squares to recover non-periodic functions from random samples. Finally, we reconsider the analysis of a cubature method based on plain random points with optimal weights and reveal near-optimal worst-case error bounds with high probability. It turns out that this simple method can compete with the quasi-Monte Carlo methods in the literature which are based on lattices and digital nets.
Submitted 1 April, 2021; v1 submitted 22 November, 2019;
originally announced November 2019.
-
Multiple Lattice Rules for Multivariate $L_\infty$ Approximation in the Worst-Case Setting
Authors:
Lutz Kämmerer
Abstract:
We develop a general framework for estimating the $L_\infty(\mathbb{T}^d)$ error for the approximation of multivariate periodic functions belonging to specific reproducing kernel Hilbert spaces (RKHS) using approximants that are trigonometric polynomials computed from sampling values. The used sampling schemes are suitable sets of rank-1 lattices that can be constructed in an extremely efficient way. Furthermore, the structure of the sampling schemes allows for fast Fourier transform (FFT) algorithms. We present and discuss one FFT algorithm and analyze the worst case $L_\infty$ error for this specific approach.
Using this general result we work out very weak requirements on the RKHS that allow for a simple upper bound on the sampling numbers in terms of approximation numbers, where the approximation error is measured in the $L_\infty$ norm. Tremendous advantages of this estimate are its pre-asymptotic validity as well as its simplicity and its specification in full detail. It turns out that approximation numbers and sampling numbers differ at most slightly. The occurring multiplicative gap does not depend on the spatial dimension $d$ and depends at most logarithmically on the number of used linear information or sampling values, respectively. Moreover, we illustrate the capability of the new sampling method with the aid of specific highly popular source spaces, which yields that the suggested algorithm is nearly optimal from different points of view. For instance, we improve tractability results for the $L_\infty$ approximation for sampling methods and we achieve almost optimal sampling rates for functions of dominating mixed smoothness.
Great advantages of the suggested sampling method are the constructive methods for determining sampling sets that guarantee the shown error bounds, as well as the simplicity and extreme efficiency of all necessary algorithms.
Submitted 5 September, 2019;
originally announced September 2019.
-
A sparse FFT approach for ODE with random coefficients
Authors:
Maximilian Bochmann,
Lutz Kämmerer,
Daniel Potts
Abstract:
The paper presents a general strategy to solve ordinary differential equations (ODE), where some coefficients depend on the spatial variable and on additional random variables. The approach is based on the application of a recently developed dimension-incremental sparse fast Fourier transform. Since such algorithms require periodic signals, we discuss periodization strategies and associated necessary deperiodization modifications within the occurring solution steps.
The computed approximate solutions of the ODE depend on the spatial variable and on the random variables as well. Certainly, one of the crucial challenges of the high dimensional approximation process is to rate the influence of each variable on the solution and to determine the relations and couplings within the set of variables. The suggested approach meets these challenges in a fully automatic manner with reasonable computational costs, i.e., in contrast to already existing approaches, one does not need to seriously restrict the used set of ansatz functions in advance.
Submitted 16 July, 2019; v1 submitted 6 January, 2019;
originally announced January 2019.
-
Approximation of multivariate periodic functions based on sampling along multiple rank-1 lattices
Authors:
Lutz Kämmerer,
Toni Volkmer
Abstract:
In this work, we consider the approximate reconstruction of high-dimensional periodic functions based on sampling values. As sampling schemes, we utilize so-called reconstructing multiple rank-1 lattices, which combine several preferable properties such as easy constructability, the existence of high-dimensional fast Fourier transform algorithms, high reliability, and low oversampling factors. In particular, we show error estimates for functions from Sobolev Hilbert spaces of generalized mixed smoothness. For instance, when measuring the sampling error in the $L_2$-norm, we show sampling error estimates where the exponent of the main part reaches that of the optimal sampling rate except for an offset of $1/2+\varepsilon$, i.e., the exponent is almost a factor of two better up to the mentioned offset compared to single rank-1 lattice sampling. Various numerical tests in medium and high dimensions demonstrate the high performance and confirm the obtained theoretical results of multiple rank-1 lattice sampling.
Submitted 10 May, 2019; v1 submitted 19 February, 2018;
originally announced February 2018.
-
High-dimensional sparse FFT based on sampling along multiple rank-1 lattices
Authors:
Lutz Kämmerer,
Daniel Potts,
Toni Volkmer
Abstract:
The reconstruction of high-dimensional sparse signals is a challenging task in a wide range of applications. In order to deal with high-dimensional problems, efficient sparse fast Fourier transform algorithms are essential tools. The second and third authors have recently proposed a dimension-incremental approach, which scales only almost linearly in the number of required sampling values and almost quadratically in the arithmetic complexity with respect to the spatial dimension $d$. Using reconstructing rank-1 lattices as sampling scheme, the method showed reliable reconstruction results in numerical tests but suffers from relatively large numbers of samples and arithmetic operations. Combining the preferable properties of reconstructing rank-1 lattices with small sample and arithmetic complexities, the first author developed the concept of multiple rank-1 lattices. In this paper, both concepts - dimension-incremental reconstruction and multiple rank-1 lattices - are coupled, which yields a distinctly improved high-dimensional sparse fast Fourier transform. Moreover, the resulting algorithm is analyzed in detail with respect to success probability, number of required samples, and arithmetic complexity. In comparison to single rank-1 lattices, the utilization of multiple rank-1 lattices results in a reduction in the complexities by an almost linear factor with respect to the sparsity. Various numerical tests confirm the theoretical results, the high performance, and the reliability of the proposed method.
Submitted 14 November, 2017;
originally announced November 2017.
-
Constructing spatial discretizations for sparse multivariate trigonometric polynomials that allow for a fast discrete Fourier transform
Authors:
Lutz Kämmerer
Abstract:
The paper discusses the construction of high dimensional spatial discretizations for arbitrary multivariate trigonometric polynomials, where the frequency support of the trigonometric polynomial is known. We suggest a construction based on the union of several rank-1 lattices as sampling scheme and call such schemes multiple rank-1 lattices. This approach automatically makes available a fast discrete Fourier transform (FFT) on the data.
The key objective of the construction of spatial discretizations is the unique reconstruction of the trigonometric polynomial using the sampling values at the sampling nodes. We develop construction methods for multiple rank-1 lattices that allow for this unique reconstruction and for estimates of the number $M$ of distinct sampling nodes within the resulting spatial discretizations. Assuming that the multivariate trigonometric polynomial under consideration is a linear combination of $T$ trigonometric monomials, the oversampling factor $M/T$ is independent of the spatial dimension and, roughly speaking, with high probability only logarithmic in $T$, which is much better than the oversampling factor that is expected when using one single rank-1 lattice.
The newly developed approaches for the construction of spatial discretizations are probabilistic methods. The arithmetic complexity of these algorithms depends only linearly on the spatial dimension and, with high probability, only linearly on $T$ up to some logarithmic factors.
Furthermore, we analyze the computational complexities of the resulting FFT algorithms in detail and obtain upper bounds in $\mathcal{O}\left(M\log M\right)$, where the constants depend only linearly on the spatial dimension. With high probability, we construct spatial discretizations where $M/T\le C\log T$ holds, which implies that the complexity of the corresponding FFT converts to $\mathcal{O}\left(T\log^2 T\right)$.
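The idea of a probabilistic multiple rank-1 lattice construction can be sketched as follows. This is a simplified illustration and not the algorithm from the paper: it assumes a single fixed prime lattice size `M` for all lattices, draws generating vectors uniformly at random, and uses the sufficient condition that every frequency is free of aliasing (within the whole frequency set) in at least one lattice. All function names are hypothetical.

```python
import numpy as np

def collision_free(I, z, M):
    """Mask of frequencies k in I whose residue (k . z) mod M is unique in I."""
    residues = (I @ z) % M
    _, inverse, counts = np.unique(residues, return_inverse=True,
                                   return_counts=True)
    return counts[inverse] == 1

def draw_multiple_lattices(I, M, rng, max_draws=1000):
    """Draw random generating vectors until every frequency is covered.

    Assumption for this sketch: one fixed lattice size M (ideally prime).
    Returns the list of generating vectors that covered new frequencies.
    """
    I = np.asarray(I)
    covered = np.zeros(len(I), dtype=bool)
    generators = []
    for _ in range(max_draws):
        if covered.all():
            break
        z = rng.integers(0, M, size=I.shape[1])
        mask = collision_free(I, z, M)
        if (mask & ~covered).any():      # keep only lattices that help
            generators.append(z)
            covered |= mask
    if not covered.all():
        raise RuntimeError("no covering found; try a larger M")
    return generators

def count_distinct_nodes(generators, M):
    """Number of distinct nodes j*z/M mod 1 in the union of the lattices."""
    nodes = set()
    for z in generators:
        for j in range(M):
            nodes.add(tuple(((j * z) % M).tolist()))
    return len(nodes)

rng = np.random.default_rng(0)
# Hypothetical frequency set: T distinct random frequencies in 5 dimensions.
I = np.unique(rng.integers(-8, 9, size=(50, 5)), axis=0)
gens = draw_multiple_lattices(I, M=101, rng=rng)
print(len(gens), count_distinct_nodes(gens, 101) / len(I))  # lattices, M/T
```

The printed ratio is the oversampling factor $M/T$ of this toy construction; the logarithmic bound stated in the abstract relies on the paper's own choice of lattice sizes, which this sketch does not reproduce.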
Submitted 17 November, 2017; v1 submitted 21 March, 2017;
originally announced March 2017.
-
Tight error bounds for rank-1 lattice sampling in spaces of hybrid mixed smoothness
Authors:
Glenn Byrenheid,
Lutz Kämmerer,
Tino Ullrich,
Toni Volkmer
Abstract:
We consider the approximate recovery of multivariate periodic functions from a discrete set of function values taken on a rank-$s$ integration lattice. The main result is the fact that any (non-)linear reconstruction algorithm taking function values on a rank-$s$ lattice of size $M$ has a dimension-independent lower bound of $2^{-(α+1)/2} M^{-α/2}$ when considering the optimal worst-case error with respect to function spaces of (hybrid) mixed smoothness $α>0$ on the $d$-torus. We complement this lower bound with upper bounds that coincide up to logarithmic terms. These upper bounds are obtained by a detailed analysis of a rank-1 lattice sampling strategy, where the rank-1 lattices are constructed by a component-by-component (CBC) method. This improves on earlier results obtained in [25] and [27]. The lattice (group) structure allows for an efficient approximation of the underlying function from its sampled values using a single one-dimensional fast Fourier transform. This is one reason why these algorithms keep attracting significant interest. We compare our results to recent (almost) optimal methods based upon samples on sparse grids.
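The remark about a single one-dimensional FFT can be illustrated with a minimal sketch. It assumes the sign convention $f(x) = \sum_{k \in I} c_k e^{2\pi i k \cdot x}$, a known frequency set `I`, and a generating vector `z` and lattice size `M` picked by hand for this example (not by a CBC method); the function names are hypothetical.

```python
import numpy as np

def rank1_lattice_nodes(z, M):
    """Nodes x_j = (j * z mod M) / M for j = 0, ..., M-1."""
    j = np.arange(M)[:, None]
    return ((j * np.asarray(z)[None, :]) % M) / M

def is_reconstructing(I, z, M):
    """True if k -> (k . z) mod M is injective on the frequency set I."""
    residues = (np.asarray(I) @ np.asarray(z)) % M
    return len(np.unique(residues)) == len(residues)

def reconstruct(samples, I, z, M):
    """Recover c_k for k in I from the M lattice samples with one 1D FFT."""
    fhat = np.fft.fft(samples) / M
    residues = (np.asarray(I) @ np.asarray(z)) % M
    return fhat[residues]

# Hand-picked example data (assumptions for this sketch).
I = np.array([[0, 0], [1, 0], [0, 1], [2, -1], [-3, 2]])
z, M = np.array([1, 5]), 17
c = np.array([1.0, 0.5j, -2.0, 0.25, 1.0 + 1.0j])

assert is_reconstructing(I, z, M)
nodes = rank1_lattice_nodes(z, M)
# Sample f on the lattice: f(x_j) = sum_k c_k exp(2 pi i k . x_j).
samples = np.exp(2j * np.pi * (nodes @ I.T)) @ c
recovered = reconstruct(samples, I, z, M)
print(np.allclose(recovered, c))
```

Along the lattice, $f(x_j) = \sum_k c_k e^{2\pi i j (k \cdot z)/M}$, so the multivariate problem collapses to a univariate one of length $M$; injectivity of $k \mapsto k \cdot z \bmod M$ on $I$ is what makes the coefficients identifiable.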
Submitted 1 August, 2016; v1 submitted 28 October, 2015;
originally announced October 2015.