Observation-specific explanations through scattered data approximation

Authors: Valentina Ghidini, Michael Multerer, Jacopo Quizi, Rohan Sen

Abstract: This work introduces the definition of observation-specific explanations to assign a score to each data point proportional to its importance in the definition of the prediction process. Such explanations involve the identification of the most influential observations for the black-box model of interest. The proposed method involves estimating these explanations by constructing a surrogate model through scattered data approximation utilizing the orthogonal matching pursuit algorithm. The proposed approach is validated on both simulated and real-world datasets.

Submitted 12 April, 2024; originally announced April 2024.

arXiv:2307.03927 [pdf, other]

Fast Empirical Scenarios

Authors: Michael Multerer, Paul Schneider, Rohan Sen

Abstract: We seek to extract a small number of representative scenarios from large and high-dimensional panel data that are consistent with sample moments. Among two novel algorithms, the first identifies scenarios that have not been observed before, and comes with a scenario-based representation of covariance matrices. The second proposal picks important data points from states of the world that have already realized, and are consistent with higher-order sample moment information. Both algorithms are efficient to compute, and lend themselves to consistent scenario-based modeling and high-dimensional numerical integration. Extensive numerical benchmarking studies and an application in portfolio optimization favor the proposed algorithms.

Submitted 5 February, 2024; v1 submitted 8 July, 2023; originally announced July 2023.

Comments: 22 pages, 7 figures

MSC Class: 11C20; 41A55; 46E22; 46N30; 60-08; 68W25

arXiv:2306.10180 [pdf, other]

Samplet basis pursuit: Multiresolution scattered data approximation with sparsity constraints

Authors: Davide Baroli, Helmut Harbrecht, Michael Multerer

Abstract: We consider scattered data approximation in samplet coordinates with $\ell_1$-regularization. The application of an $\ell_1$-regularization term enforces sparsity of the coefficients with respect to the samplet basis. Samplets are wavelet-type signed measures, which are tailored to scattered data. Therefore, samplets enable the use of well-established multiresolution techniques on general scattered data sets. They provide similar properties as wavelets in terms of localization, multiresolution analysis, and data compression. By using the Riesz isometry, we embed samplets into reproducing kernel Hilbert spaces and discuss the properties of the resulting functions. We argue that the class of signals that are sparse with respect to the embedded samplet basis is considerably larger than the class of signals that are sparse with respect to the basis of kernel translates. Vice versa, every signal that is a linear combination of only a few kernel translates is sparse in samplet coordinates. We propose the rapid solution of the problem under consideration by combining soft-shrinkage with the semi-smooth Newton method. Leveraging on the sparse representation of kernel matrices in samplet coordinates, this approach converges faster than the fast iterative shrinkage thresholding algorithm and is feasible for large-scale data. Numerical benchmarks are presented and demonstrate the superiority of the multiresolution approach over the single-scale approach. As large-scale applications, the surface reconstruction from scattered data and the reconstruction of scattered temperature data using a dictionary of multiple kernels are considered.

Submitted 2 April, 2024; v1 submitted 16 June, 2023; originally announced June 2023.

arXiv:2211.11681 [pdf, other]

Multiresolution kernel matrix algebra

Authors: H. Harbrecht, M. Multerer, O. Schenk, Ch. Schwab

Abstract: We propose a sparse algebra for samplet compressed kernel matrices, to enable efficient scattered data analysis. We show the compression of kernel matrices by means of samplets produces optimally sparse matrices in a certain S-format. It can be performed in cost and memory that scale essentially linearly with the matrix size $N$, for kernels of finite differentiability, along with addition and multiplication of S-formatted matrices. We prove and exploit the fact that the inverse of a kernel matrix (if it exists) is compressible in the S-format as well. Selected inversion allows to directly compute the entries in the corresponding sparsity pattern. The S-formatted matrix operations enable the efficient, approximate computation of more complicated matrix functions such as ${\bm A}^\alpha$ or $\exp({\bm A})$. The matrix algebra is justified mathematically by pseudo differential calculus. As an application, efficient Gaussian process learning algorithms for spatial statistics is considered. Numerical results are presented to illustrate and quantify our findings.

Submitted 3 May, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

arXiv:2210.14874 [pdf, other]

Anisotropic multiresolution analyses for deepfake detection

Authors: Wei Huang, Michelangelo Valsecchi, Michael Multerer

Abstract: Generative Adversarial Networks (GANs) have paved the path towards entirely new media generation capabilities at the forefront of image, video, and audio synthesis. However, they can also be misused and abused to fabricate elaborate lies, capable of stirring up the public debate. The threat posed by GANs has sparked the need to discern between genuine content and fabricated one. Previous studies have tackled this task by using classical machine learning techniques, such as k-nearest neighbours and eigenfaces, which unfortunately did not prove very effective. Subsequent methods have focused on leveraging on frequency decompositions, i.e., discrete cosine transform, wavelets, and wavelet packets, to preprocess the input features for classifiers. However, existing approaches only rely on isotropic transformations. We argue that, since GANs primarily utilize isotropic convolutions to generate their output, they leave clear traces, their fingerprint, in the coefficient distribution on sub-bands extracted by anisotropic transformations. We employ the fully separable wavelet transform and multiwavelets to obtain the anisotropic features to feed to standard CNN classifiers. Lastly, we find the fully separable transform capable of improving the state-of-the-art.

Submitted 4 November, 2022; v1 submitted 26 October, 2022; originally announced October 2022.

arXiv:2110.04829 [pdf, other]

Adaptive joint distribution learning

Authors: Damir Filipovic, Michael Multerer, Paul Schneider

Abstract: We develop a new framework for embedding joint probability distributions in tensor product reproducing kernel Hilbert spaces (RKHS). Our framework accommodates a low-dimensional, normalized and positive model of a Radon-Nikodym derivative, which we estimate from sample sizes of up to several million data points, alleviating the inherent limitations of RKHS modeling. Well-defined normalized and positive conditional distributions are natural by-products to our approach. The embedding is fast to compute and accommodates learning problems ranging from prediction to classification. Our theoretical findings are supplemented by favorable numerical results.

Submitted 10 January, 2024; v1 submitted 10 October, 2021; originally announced October 2021.

arXiv:2107.03337 [pdf, other]

Samplets: A new paradigm for data compression

Authors: Helmut Harbrecht, Michael Multerer

Abstract: In this article, we introduce the concept of samplets by transferring the construction of Tausch-White wavelets to the realm of data. This way we obtain a multilevel representation of discrete data which directly enables data compression, detection of singularities and adaptivity. Applying samplets to represent kernel matrices, as they arise in kernel based learning or Gaussian process regression, we end up with quasi-sparse matrices. By thresholding small entries, these matrices are compressible to O(N log N) relevant entries, where N is the number of data points. This feature allows for the use of fill-in reducing reorderings to obtain a sparse factorization of the compressed matrices. Besides the comprehensive introduction to samplets and their properties, we present extensive numerical studies to benchmark the approach. Our results demonstrate that samplets mark a considerable step in the direction of making large data sets accessible for analysis.

Submitted 16 November, 2021; v1 submitted 7 July, 2021; originally announced July 2021.

arXiv:2105.02007 [pdf, other]

Space-time multilevel quadrature methods and their application for cardiac electrophysiology

Authors: Seif Ben Bader, Helmut Harbrecht, Rolf Krause, Michael Multerer, Alessio Quaglino, Marc Schmidlin

Abstract: We present a novel approach which aims at high-performance uncertainty quantification for cardiac electrophysiology simulations. Employing the monodomain equation to model the transmembrane potential inside the cardiac cells, we evaluate the effect of spatially correlated perturbations of the heart fibers on the statistics of the resulting quantities of interest. Our methodology relies on a close integration of multilevel quadrature methods, parallel iterative solvers and space-time finite element discretizations, allowing for a fully parallelized framework in space, time and stochastics. Extensive numerical studies are presented to evaluate convergence rates and to compare the performance of classical Monte Carlo methods such as standard Monte Carlo (MC) and quasi-Monte Carlo (QMC), as well as multilevel strategies, i.e. multilevel Monte Carlo (MLMC) and multilevel quasi-Monte Carlo (MLQMC) on hierarchies of nested meshes. Finally, we employ a recently suggested variant of the multilevel approach for non-nested meshes to deal with a realistic heart geometry.

Submitted 5 May, 2021; originally announced May 2021.

arXiv:2102.09960 [pdf, other]

Fast and Accurate Uncertainty Quantification for the ECG with Random Electrodes Location

Authors: Michael Multerer, Simone Pezzuto

Abstract: The standard electrocardiogram (ECG) is a point-wise evaluation of the body potential at certain given locations. These locations are subject to uncertainty and may vary from patient to patient or even for a single patient. In this work, we estimate the uncertainty in the ECG induced by uncertain electrode positions when the ECG is derived from the forward bidomain model. In order to avoid the high computational cost associated to the solution of the bidomain model in the entire torso, we propose a low-rank approach to solve the uncertainty quantification (UQ) problem. More precisely, we exploit the sparsity of the ECG and the lead field theory to translate it into a set of deterministic, time-independent problems, whose solution is eventually used to evaluate expectation and covariance of the ECG. We assess the approach with numerical experiments in a simple geometry.

Submitted 19 April, 2021; v1 submitted 19 February, 2021; originally announced February 2021.

Comments: 12 pages, 4 figures

arXiv:2010.16104 [pdf, other]

Space-time shape uncertainties in the forward and inverse problem of electrocardiography

Authors: Lia Gander, Rolf Krause, Michael Multerer, Simone Pezzuto

Abstract: In electrocardiography, the "classic" inverse problem is the reconstruction of electric potentials at a surface enclosing the heart from remote recordings at the body surface and an accurate description of the anatomy. The latter being affected by noise and obtained with limited resolution due to clinical constraints, a possibly large uncertainty may be perpetuated in the inverse reconstruction. The purpose of this work is to study the effect of shape uncertainty on the forward and the inverse problem of electrocardiography. To this aim, the problem is first recast into a boundary integral formulation and then discretised with a collocation method to achieve high convergence rates and a fast time to solution. The shape uncertainty of the domain is represented by a random deformation field defined on a reference configuration. We propose a periodic-in-time covariance kernel for the random field and approximate the Karhunen-Loève expansion using low-rank techniques for fast sampling. The space-time uncertainty in the expected potential and its variance is evaluated with an anisotropic sparse quadrature approach and validated by a quasi-Monte Carlo method. We present several numerical experiments on a simplified but physiologically grounded 2-dimensional geometry to illustrate the validity of the approach. The tested parametric dimension ranged from 100 up to 600. For the forward problem the sparse quadrature is very effective. In the inverse problem, the sparse quadrature and the quasi-Monte Carlo method perform as expected, except for the total variation regularisation, where convergence is limited by lack of regularity. We finally investigate an $H^{1/2}$ regularisation, which naturally stems from the boundary integral formulation, and compare it to more classical approaches.

Submitted 28 June, 2021; v1 submitted 30 October, 2020; originally announced October 2020.

Comments: 23 pages, 6 figures, 1 table

arXiv:1906.00785 [pdf, other]

Bembel: The Fast Isogeometric Boundary Element C++ Library for Laplace, Helmholtz, and Electric Wave Equation

Authors: J. Dölz, H. Harbrecht, S. Kurz, M. Multerer, S. Schöps, F. Wolf

Abstract: In this article, we present Bembel, the C++ library featuring higher order isogeometric Galerkin boundary element methods for Laplace, Helmholtz, and Maxwell problems. Bembel is compatible with geometries from the Octave NURBS package and provides an interface to the Eigen template library for linear algebra operations. For computational efficiency, it applies an embedded fast multipole method tailored to the isogeometric analysis framework and a parallel matrix assembly based on OpenMP.

Submitted 3 June, 2019; originally announced June 2019.

Showing 1–11 of 11 results for author: Multerer, M