Showing 1–31 of 31 results for author: Kratsios, A

  1. arXiv:2406.02969  [pdf, other]

    cs.LG cs.AI cs.CL q-fin.CP q-fin.MF

    Filtered not Mixed: Stochastic Filtering-Based Online Gating for Mixture of Large Language Models

    Authors: Raeid Saqur, Anastasis Kratsios, Florian Krach, Yannick Limmer, Jacob-Junqi Tian, John Willes, Blanka Horvath, Frank Rudzicz

    Abstract: We propose MoE-F -- a formalised mechanism for combining $N$ pre-trained expert Large Language Models (LLMs) in online time-series prediction tasks by adaptively forecasting the best weighting of LLM predictions at every time step. Our mechanism leverages the conditional information in each expert's running performance to forecast the best combination of LLMs for predicting the time series in its…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 29 pages, 5 Appendix sections

    MSC Class: 60J05; 60G35; 68T20; 68T42; 68T50 ACM Class: I.2.6; I.2.7; G.3

  2. arXiv:2405.20094  [pdf, other]

    math.NA cs.LG cs.NE math.DG q-fin.CP

    Low-dimensional approximations of the conditional law of Volterra processes: a non-positive curvature approach

    Authors: Reza Arabpour, John Armstrong, Luca Galimberti, Anastasis Kratsios, Giulia Livieri

    Abstract: Predicting the conditional evolution of Volterra processes with stochastic volatility is a crucial challenge in mathematical finance. While deep neural network models offer promise in approximating the conditional law of such processes, their effectiveness is hindered by the curse of dimensionality caused by the infinite dimensionality and non-smooth nature of these problems. To address this, we p…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Main body: 25 Pages, Appendices 29 Pages, 14 Tables, 6 Figures

  3. arXiv:2405.16563  [pdf, other]

    cs.LG cs.NE math.NA math.PR stat.ML

    Reality Only Happens Once: Single-Path Generalization Bounds for Transformers

    Authors: Yannick Limmer, Anastasis Kratsios, Xuwei Yang, Raeid Saqur, Blanka Horvath

    Abstract: One of the inherent challenges in deploying transformers on time series is that \emph{reality only happens once}; namely, one typically only has access to a single trajectory of the data-generating process comprised of non-i.i.d. observations. We derive non-asymptotic statistical guarantees in this setting through bounds on the \textit{generalization} of a transformer network at a future-time $t$,…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: 11 pages (+30 appendix), 3 figures, 6 tables

    MSC Class: 60G35; 62M20; 68T07; 41A65

  4. arXiv:2404.09101  [pdf, ps, other]

    cs.LG cs.AI math.NA stat.ML

    Mixture of Experts Soften the Curse of Dimensionality in Operator Learning

    Authors: Anastasis Kratsios, Takashi Furuya, Jose Antonio Lara Benitez, Matti Lassas, Maarten de Hoop

    Abstract: In this paper, we construct a mixture of neural operators (MoNOs) between function spaces whose complexity is distributed over a network of expert neural operators (NOs), with each NO satisfying parameter scaling restrictions. Our main result is a \textit{distributed} universal approximation theorem guaranteeing that any Lipschitz non-linear operator between $L^2([0,1]^d)$ spaces can be approximat…

    Submitted 13 April, 2024; originally announced April 2024.

  5. arXiv:2402.05576  [pdf, other]

    cs.LG

    Tighter Generalization Bounds on Digital Computers via Discrete Optimal Transport

    Authors: Anastasis Kratsios, A. Martina Neuman, Gudmund Pammer

    Abstract: Machine learning models with inputs in a Euclidean space $\mathbb{R}^d$, when implemented on digital computers, generalize, and their {\it generalization gap} converges to $0$ at a rate of $c/N^{1/2}$ concerning the sample size $N$. However, the constant $c>0$ obtained through classical methods can be large in terms of the ambient dimension $d$ and the machine precision, posing a challenge when…

    Submitted 14 April, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  6. arXiv:2402.03460  [pdf, other]

    stat.ML cs.LG cs.NE math.CO math.NA

    Approximation Rates and VC-Dimension Bounds for (P)ReLU MLP Mixture of Experts

    Authors: Anastasis Kratsios, Haitz Sáez de Ocáriz Borde, Takashi Furuya, Marc T. Law

    Abstract: Mixture-of-Experts (MoEs) can scale up beyond traditional deep learning models by employing a routing strategy in which each input is processed by a single "expert" deep learning model. This strategy allows us to scale up the number of parameters defining the MoE while maintaining sparse activation, i.e., MoEs only load a small number of their total parameters into GPU VRAM for the forward pass de…

    Submitted 25 May, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  7. arXiv:2402.01297  [pdf, other]

    cs.LG stat.ML

    Characterizing Overfitting in Kernel Ridgeless Regression Through the Eigenspectrum

    Authors: Tin Sum Cheng, Aurelien Lucchi, Anastasis Kratsios, David Belius

    Abstract: We derive new bounds for the condition number of kernel matrices, which we then use to enhance existing non-asymptotic test error bounds for kernel ridgeless regression (KRR) in the over-parameterized regime for a fixed input dimension. For kernels with polynomial spectral decay, we recover the bound from previous work; for exponential decay, our bound is non-trivial and novel. Our contribution is…

    Submitted 29 May, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  8. arXiv:2310.19603  [pdf, other]

    cs.LG cs.NE math.NA math.PR stat.ML

    Deep Kalman Filters Can Filter

    Authors: Blanka Horvath, Anastasis Kratsios, Yannick Limmer, Xuwei Yang

    Abstract: Deep Kalman filters (DKFs) are a class of neural network models that generate Gaussian probability measures from sequential data. Though DKFs are inspired by the Kalman filter, they lack concrete theoretical ties to the stochastic filtering problem, thus limiting their applicability to areas where traditional model-based filters have been used, e.g.\ model calibration for bond and option prices in…

    Submitted 30 October, 2023; originally announced October 2023.

    MSC Class: 60G35; 62M20; 68T07; 41A65

  9. arXiv:2310.15003  [pdf, other]

    cs.LG cs.DM cs.NE math.MG

    Neural Snowflakes: Universal Latent Graph Inference via Trainable Latent Geometries

    Authors: Haitz Sáez de Ocáriz Borde, Anastasis Kratsios

    Abstract: The inductive bias of a graph neural network (GNN) is largely encoded in its specified graph. Latent graph inference relies on latent geometric representations to dynamically rewire or infer a GNN's graph to maximize the GNN's predictive downstream performance, but it lacks solid theoretical foundations in terms of embedding-based representation guarantees. This paper addresses this issue by intro…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: 9 Pages + Appendix, 2 Figures, 9 Tables

  10. arXiv:2310.01105  [pdf, other]

    cs.LG stat.ML

    Energy-Guided Continuous Entropic Barycenter Estimation for General Costs

    Authors: Alexander Kolesov, Petr Mokrov, Igor Udovichenko, Milena Gazdieva, Gudmund Pammer, Anastasis Kratsios, Evgeny Burnaev, Alexander Korotin

    Abstract: Optimal transport (OT) barycenters are a mathematically grounded way of averaging probability distributions while capturing their geometric properties. In short, the barycenter task is to take the average of a collection of probability distributions w.r.t. given OT discrepancies. We propose a novel algorithm for approximating the continuous Entropic OT (EOT) barycenter for arbitrary OT cost functi…

    Submitted 27 May, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

  11. arXiv:2310.00987  [pdf, other]

    cs.LG stat.ML

    A Theoretical Analysis of the Test Error of Finite-Rank Kernel Ridge Regression

    Authors: Tin Sum Cheng, Aurelien Lucchi, Ivan Dokmanić, Anastasis Kratsios, David Belius

    Abstract: Existing statistical learning guarantees for general kernel regressors often yield loose bounds when used with finite-rank kernels. Yet, finite-rank kernels naturally appear in several machine learning problems, e.g.\ when fine-tuning a pre-trained deep neural network's last layer to adapt it to a novel task when performing transfer learning. We address this gap for finite-rank kernel ridge regres…

    Submitted 3 October, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

  12. arXiv:2309.04557  [pdf, other]

    cs.LG math.DS math.OC q-fin.CP

    Regret-Optimal Federated Transfer Learning for Kernel Regression with Applications in American Option Pricing

    Authors: Xuwei Yang, Anastasis Kratsios, Florian Krach, Matheus Grasselli, Aurelien Lucchi

    Abstract: We propose an optimal iterative scheme for federated transfer learning, where a central planner has access to datasets ${\cal D}_1,\dots,{\cal D}_N$ for the same learning model $f_θ$. Our objective is to minimize the cumulative deviation of the generated parameters $\{θ_i(t)\}_{t=0}^T$ across all $T$ iterations from the specialized parameters $θ^\star_{1},\ldots,θ^\star_N$ obtained for each datase…

    Submitted 8 September, 2023; originally announced September 2023.

    Comments: 54 pages, 3 figures

  13. arXiv:2308.09250  [pdf, other]

    cs.LG cs.DM cs.NE math.MG math.NA

    Capacity Bounds for Hyperbolic Neural Network Representations of Latent Tree Structures

    Authors: Anastasis Kratsios, Ruiyang Hong, Haitz Sáez de Ocáriz Borde

    Abstract: We study the representation capacity of deep hyperbolic neural networks (HNNs) with a ReLU activation function. We establish the first proof that HNNs can $\varepsilon$-isometrically embed any finite weighted tree into a hyperbolic space of dimension $d$ at least equal to $2$ with prescribed sectional curvature $κ<0$, for any $\varepsilon> 1$ (where $\varepsilon=1$ being optimal). We establish rig…

    Submitted 17 August, 2023; originally announced August 2023.

    Comments: 22 Pages + References, 1 Table, 4 Figures

    MSC Class: 68T07; 30L05; 68R12; 05C05

  14. arXiv:2304.12231  [pdf, other]

    cs.LG cs.NE math.NA math.PR stat.ML

    An Approximation Theory for Metric Space-Valued Functions With A View Towards Deep Learning

    Authors: Anastasis Kratsios, Chong Liu, Matti Lassas, Maarten V. de Hoop, Ivan Dokmanić

    Abstract: Motivated by the developing mathematics of deep learning, we build universal functions approximators of continuous maps between arbitrary Polish metric spaces $\mathcal{X}$ and $\mathcal{Y}$ using elementary functions between Euclidean spaces as building blocks. Earlier results assume that the target space $\mathcal{Y}$ is a topological vector space. We overcome this limitation by ``randomization'…

    Submitted 24 July, 2023; v1 submitted 24 April, 2023; originally announced April 2023.

    Comments: 14 Figures, 3 Tables, 78 Pages (Main 40, Proofs 26, Acknowledgments and References 12)

    MSC Class: 41A65; 68T07; 60L50; 65N21; 46T99

  15. arXiv:2302.09176  [pdf, other]

    q-fin.CP cs.LG cs.NE

    Generative Ornstein-Uhlenbeck Markets via Geometric Deep Learning

    Authors: Anastasis Kratsios, Cody Hyndman

    Abstract: We consider the problem of simultaneously approximating the conditional distribution of market prices and their log returns with a single machine learning model. We show that an instance of the GDN model of Kratsios and Papon (2022) solves this problem without having prior assumptions on the market's "clipped" log returns, other than that they follow a generalized Ornstein-Uhlenbeck process with a…

    Submitted 17 February, 2023; originally announced February 2023.

    Comments: 9 Pages, 1 Figure

    MSC Class: 68T07; 62M45; 91G60; 91G20

  16. arXiv:2301.11509  [pdf, other]

    cs.LG math.NA stat.ML

    Out-of-distributional risk bounds for neural operators with applications to the Helmholtz equation

    Authors: J. Antonio Lara Benitez, Takashi Furuya, Florian Faucher, Anastasis Kratsios, Xavier Tricoche, Maarten V. de Hoop

    Abstract: Despite their remarkable success in approximating a wide range of operators defined by PDEs, existing neural operators (NOs) do not necessarily perform well for all physics problems. We focus here on high-frequency waves to highlight possible shortcomings. To resolve these, we propose a subfamily of NOs enabling an enhanced empirical approximation of the nonlinear operator mapping wave speed to so…

    Submitted 4 July, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

  17. arXiv:2211.01258  [pdf, other]

    stat.ML cs.LG

    Instance-Dependent Generalization Bounds via Optimal Transport

    Authors: Songyan Hou, Parnian Kassraie, Anastasis Kratsios, Andreas Krause, Jonas Rothfuss

    Abstract: Existing generalization bounds fail to explain crucial factors that drive the generalization of modern neural networks. Since such bounds often hold uniformly over all parameters, they suffer from over-parametrization and fail to account for the strong inductive bias of initialization and stochastic gradient descent. As an alternative, we propose a novel optimal transport interpretation of the gen…

    Submitted 13 November, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

    Comments: Journal of Machine Learning Research (JMLR), 51 pages

  18. arXiv:2210.13300  [pdf, other]

    math.DS cs.LG q-fin.CP

    Designing Universal Causal Deep Learning Models: The Case of Infinite-Dimensional Dynamical Systems from Stochastic Analysis

    Authors: Luca Galimberti, Anastasis Kratsios, Giulia Livieri

    Abstract: Causal operators (CO), such as various solution operators to stochastic differential equations, play a central role in contemporary stochastic analysis; however, there is still no canonical framework for designing Deep Learning (DL) models capable of approximating COs. This paper proposes a "geometry-aware'" solution to this open problem by introducing a DL model-design framework that takes suitab…

    Submitted 9 May, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

  19. arXiv:2209.06788  [pdf, other]

    cs.LG cs.NE math.CO math.MG stat.ML

    Small Transformers Compute Universal Metric Embeddings

    Authors: Anastasis Kratsios, Valentin Debarnot, Ivan Dokmanić

    Abstract: We study representations of data from an arbitrary metric space $\mathcal{X}$ in the space of univariate Gaussian mixtures with a transport metric (Delon and Desolneux 2020). We derive embedding guarantees for feature maps implemented by small neural networks called \emph{probabilistic transformers}. Our guarantees are of memorization type: we prove that a probabilistic transformer of depth about…

    Submitted 18 October, 2022; v1 submitted 14 September, 2022; originally announced September 2022.

    Comments: 42 pages, 10 Figures, 3 Tables

    MSC Class: 68T07; 30L05; 68R12; 68T30; 05C12

    Journal ref: Journal of Machine Learning Research 24 (2023): 1-48

  20. arXiv:2204.11231  [pdf, other]

    cs.LG cs.AI cs.NE math.FA math.NA

    Do ReLU Networks Have An Edge When Approximating Compactly-Supported Functions?

    Authors: Anastasis Kratsios, Behnoosh Zamanlooy

    Abstract: We study the problem of approximating compactly-supported integrable functions while implementing their support set using feedforward neural networks. Our first main result transcribes this "structured" approximation problem into a universality problem. We do this by constructing a refinement of the usual topology on the space $L^1_{\operatorname{loc}}(\mathbb{R}^d,\mathbb{R}^D)$ of locally-integr…

    Submitted 1 August, 2022; v1 submitted 24 April, 2022; originally announced April 2022.

    Comments: 23 Pages: Main Text - 16 pages, Appendix - 7.5 pages, - Bibliography - 5 pages

    MSC Class: 68T07; 41A65; 46M40; 46M15

  21. arXiv:2201.13094  [pdf, other]

    cs.LG cs.NE math.MG math.PR q-fin.CP

    Designing Universal Causal Deep Learning Models: The Geometric (Hyper)Transformer

    Authors: Beatrice Acciaio, Anastasis Kratsios, Gudmund Pammer

    Abstract: Several problems in stochastic analysis are defined through their geometry, and preserving that geometric structure is essential to generating meaningful predictions. Nevertheless, how to design principled deep learning (DL) models capable of encoding these geometric structures remains largely unknown. We address this open problem by introducing a universal causal geometric DL framework in which t…

    Submitted 9 March, 2023; v1 submitted 31 January, 2022; originally announced January 2022.

    Comments: Main Body: 31 Pages, Proofs: 16 Pages, Figures: 13, Tables: 3

    MSC Class: 68T07; 49Q22; 41A65; 30L99; 60G25; 60H35

  22. arXiv:2110.03303  [pdf, other]

    cs.LG cs.AI cs.NE math.FA math.MG

    Universal Approximation Under Constraints is Possible with Transformers

    Authors: Anastasis Kratsios, Behnoosh Zamanlooy, Tianlin Liu, Ivan Dokmanić

    Abstract: Many practical problems need the output of a machine learning model to satisfy a set of constraints, $K$. Nevertheless, there is no known guarantee that classical neural network architectures can exactly encode constraints while simultaneously achieving universality. We provide a quantitative constrained universal approximation theorem which guarantees that for any non-convex compact set $K$ and a…

    Submitted 8 February, 2022; v1 submitted 7 October, 2021; originally announced October 2021.

    Comments: 9.5 Pages + 14 Page Append + References, 3 Tables, 5 Figures

    MSC Class: 68T07; 41A65; 41A29; 51F99

    Journal ref: ICLR 2022 (Spotlight)

  23. arXiv:2105.07743  [pdf, other]

    cs.LG cs.NE math.MG math.PR stat.ML

    Universal Regular Conditional Distributions

    Authors: Anastasis Kratsios

    Abstract: We introduce a deep learning model that can universally approximate regular conditional distributions (RCDs). The proposed model operates in three phases: first, it linearizes inputs from a given metric space $\mathcal{X}$ to $\mathbb{R}^d$ via a feature map, then a deep feedforward neural network processes these linearized features, and then the network's outputs are then transformed to the $1$-W…

    Submitted 23 February, 2023; v1 submitted 17 May, 2021; originally announced May 2021.

    Comments: Regular Conditional Distributions, Geometric Deep Learning, Computational Optimal Transport, Measure-Valued Neural Networks, Universal Approximation, Transformers

    MSC Class: 68T07; 28A50; 49Q22; 54C65

  24. arXiv:2101.05390  [pdf, ps, other]

    cs.LG math.FA math.GN math.GT

    Universal Approximation Theorems for Differentiable Geometric Deep Learning

    Authors: Anastasis Kratsios, Leonie Papon

    Abstract: This paper addresses the growing need to process non-Euclidean data, by introducing a geometric deep learning (GDL) framework for building universal feedforward-type models compatible with differentiable manifold geometries. We show that our GDL models can approximate any continuous target function uniformly on compact sets of a controlled maximum diameter. We obtain curvature-dependent lower-boun…

    Submitted 25 July, 2022; v1 submitted 13 January, 2021; originally announced January 2021.

    Comments: Keywords: Geometric Deep Learning, Symmetric Positive-Definite Matrices, Hyperbolic Neural Networks, Deep Kalman Filter, Shape Space, Riemannian Manifolds, Curse of Dimensionality. Additional Information: 33 Pages + 30 Pages Appendix + Bibliography, 2 Tables, 7 Figures;

    MSC Class: 68T07; 46T99; 46T10; 54C35; 68T45; 41A65

  25. arXiv:2101.00041  [pdf, other]

    cs.LG math.OC

    Optimizing Optimizers: Regret-optimal gradient descent algorithms

    Authors: Philippe Casgrain, Anastasis Kratsios

    Abstract: The need for fast and robust optimization algorithms are of critical importance in all areas of machine learning. This paper treats the task of designing optimization algorithms as an optimal control problem. Using regret as a metric for an algorithm's performance, we study the existence, uniqueness and consistency of regret-optimal algorithms. By providing first-order optimality conditions for th…

    Submitted 19 January, 2021; v1 submitted 31 December, 2020; originally announced January 2021.

    Comments: 12 pages body, 42 pages total, 2 figures

    MSC Class: 49K10; 49J10; 49M05; 49K27; 49J50; 65K10 ACM Class: F.2.0

  26. arXiv:2010.15571  [pdf, other]

    cs.NE cs.LG stat.ML

    Learning Sub-Patterns in Piecewise Continuous Functions

    Authors: Anastasis Kratsios, Behnoosh Zamanlooy

    Abstract: Most stochastic gradient descent algorithms can optimize neural networks that are sub-differentiable in their parameters; however, this implies that the neural network's activation function must exhibit a degree of continuity which limits the neural network model's uniform approximation capacity to continuous functions. This paper focuses on the case where the discontinuities arise from distinct s…

    Submitted 15 December, 2021; v1 submitted 29 October, 2020; originally announced October 2020.

    Comments: 16 Pages + 7 Page Appendix, 9 Figures, and 6 Tables

    MSC Class: 68T07; 49J53; 05A18; 65Y05; 68W20 ACM Class: I.5.1; F.2.1

  27. arXiv:2006.14378  [pdf, ps, other]

    cs.LG cs.NE math.FA stat.ML

    A Canonical Transform for Strengthening the Local $L^p$-Type Universal Approximation Property

    Authors: Anastasis Kratsios, Behnoosh Zamanlooy

    Abstract: Most $L^p$-type universal approximation theorems guarantee that a given machine learning model class $\mathscr{F}\subseteq C(\mathbb{R}^d,\mathbb{R}^D)$ is dense in $L^p_μ(\mathbb{R}^d,\mathbb{R}^D)$ for any suitable finite Borel measure $μ$ on $\mathbb{R}^d$. Unfortunately, this means that the model's approximation quality can rapidly degenerate outside some compact subset of $\mathbb{R}^d$, as a…

    Submitted 9 June, 2021; v1 submitted 24 June, 2020; originally announced June 2020.

    Comments: 8 pages + 12 page appendix

    MSC Class: 41A65; 46M40; 46M20; 46E30; 46M15; 68T07 ACM Class: I.5.1; F.2.1

  28. arXiv:2006.02341  [pdf, ps, other]

    cs.LG cs.NE math.DG math.GN stat.ML

    Non-Euclidean Universal Approximation

    Authors: Anastasis Kratsios, Eugene Bilokopytov

    Abstract: Modifications to a neural network's input and output layers are often required to accommodate the specificities of most practical learning tasks. However, the impact of such changes on architecture's approximation capabilities is largely not understood. We present general conditions describing feature and readout maps that preserve an architecture's ability to approximate any continuous functions…

    Submitted 7 November, 2020; v1 submitted 3 June, 2020; originally announced June 2020.

    Comments: 21 Pages

    MSC Class: 68T07; 68T05; 41A65; 46T99; 46T10; 54C35 ACM Class: I.2.6

    Journal ref: 33rd Conference on Neural Information Processing Systems (NeurIPS 2020)

  29. arXiv:2004.13612  [pdf, other]

    stat.ML cs.LG math.OC q-fin.CP

    Denise: Deep Robust Principal Component Analysis for Positive Semidefinite Matrices

    Authors: Calypso Herrera, Florian Krach, Anastasis Kratsios, Pierre Ruyssen, Josef Teichmann

    Abstract: The robust PCA of covariance matrices plays an essential role when isolating key explanatory features. The currently available methods for performing such a low-rank plus sparse decomposition are matrix specific, meaning, those algorithms must re-run for every new matrix. Since these algorithms are computationally expensive, it is preferable to learn and store a function that nearly instantaneousl…

    Submitted 6 June, 2023; v1 submitted 28 April, 2020; originally announced April 2020.

    Journal ref: Transactions on Machine Learning Research (2023)

  30. arXiv:1910.03344  [pdf, ps, other]

    stat.ML cs.LG math.DS

    The Universal Approximation Property

    Authors: Anastasis Kratsios

    Abstract: The universal approximation property of various machine learning models is currently only understood on a case-by-case basis, limiting the rapid development of new theoretically justified neural network architectures and blurring our understanding of our current models' potential. This paper works towards overcoming these challenges by presenting a characterization, a representation, a constructio…

    Submitted 28 November, 2020; v1 submitted 8 October, 2019; originally announced October 2019.

    MSC Class: 68T07; 47B33; 47A16; 68T05; 30L05; 46M40; 47B33

    Journal ref: Annals of Mathematics and Artificial Intelligence, 2020

  31. arXiv:1809.00082  [pdf, other]

    stat.ML cs.LG math.NA math.PR q-fin.CP

    NEU: A Meta-Algorithm for Universal UAP-Invariant Feature Representation

    Authors: Anastasis Kratsios, Cody Hyndman

    Abstract: Effective feature representation is key to the predictive performance of any algorithm. This paper introduces a meta-procedure, called Non-Euclidean Upgrading (NEU), which learns feature maps that are expressive enough to embed the universal approximation property (UAP) into most model classes while only outputting feature maps that preserve any model class's UAP. We show that NEU can learn any fe…

    Submitted 10 May, 2021; v1 submitted 31 August, 2018; originally announced September 2018.

    Comments: 28 pages: main body, 24 pages: appendix, 8 Figures, 11 Tables

    MSC Class: 68T07; 68T30; 65D15; 91G80; 62H25; 62G08; 57-08

    Journal ref: Journal of Machine Learning Research (JMLR), Volume: 22; 2021