Search | arXiv e-print repository

AutoNumerics-Zero: Automated Discovery of State-of-the-Art Mathematical Functions

Authors: Esteban Real, Yao Chen, Mirko Rossini, Connal de Souza, Manav Garg, Akhil Verghese, Moritz Firsching, Quoc V. Le, Ekin Dogus Cubuk, David H. Park

Abstract: Computers calculate transcendental functions by approximating them through the composition of a few limited-precision instructions. For example, an exponential can be calculated with a Taylor series. These approximation methods were developed over the centuries by mathematicians, who emphasized the attainability of arbitrary precision. Computers, however, operate on few limited precision types, such as the popular float32. In this study, we show that when aiming for limited precision, existing approximation methods can be outperformed by programs automatically discovered from scratch by a simple evolutionary algorithm. In particular, over real numbers, our method can approximate the exponential function reaching orders of magnitude more precision for a given number of operations when compared to previous approaches. More practically, over float32 numbers and constrained to less than 1 ULP of error, the same method attains a speedup over baselines by generating code that triggers better XLA/LLVM compilation paths. In other words, in both cases, evolution searched a vast space of possible programs, without knowledge of mathematics, to discover previously unknown optimized approximations to high precision, for the first time. We also give evidence that these results extend beyond the exponential. The ubiquity of transcendental functions suggests that our method has the potential to reduce the cost of scientific computing applications.

Submitted 13 December, 2023; originally announced December 2023.

ACM Class: I.2.2; I.2.6; G.1.2

arXiv:2008.03936 [pdf, other]

Intelligent Matrix Exponentiation

Authors: Thomas Fischbacher, Iulia M. Comsa, Krzysztof Potempa, Moritz Firsching, Luca Versari, Jyrki Alakuijala

Abstract: We present a novel machine learning architecture that uses the exponential of a single input-dependent matrix as its only nonlinearity. The mathematical simplicity of this architecture allows a detailed analysis of its behaviour, providing robustness guarantees via Lipschitz bounds. Despite its simplicity, a single matrix exponential layer already provides universal approximation properties and can learn fundamental functions of the input, such as periodic functions or multivariate polynomials. This architecture outperforms other general-purpose architectures on benchmark problems, including CIFAR-10, using substantially fewer parameters.

Submitted 10 August, 2020; originally announced August 2020.

Comments: 20 pages, 10 figures

arXiv:1908.03565 [pdf]

Committee Draft of JPEG XL Image Coding System

Authors: Alexander Rhatushnyak, Jan Wassenberg, Jon Sneyers, Jyrki Alakuijala, Lode Vandevenne, Luca Versari, Robert Obryk, Zoltan Szabadka, Evgenii Kliuchnikov, Iulia-Maria Comsa, Krzysztof Potempa, Martin Bruse, Moritz Firsching, Renata Khasanova, Ruud van Asseldonk, Sami Boukortt, Sebastian Gomez, Thomas Fischbacher

Abstract: JPEG XL is a practical approach focused on scalable web distribution and efficient compression of high-quality images. It provides various benefits compared to existing image formats: 60% size reduction at equivalent subjective quality; fast, parallelizable decoding and encoding configurations; features such as progressive, lossless, animation, and reversible transcoding of existing JPEG with 22% size reduction; support for high-quality applications including wide gamut, higher resolution/bit depth/dynamic range, and visually lossless coding. The JPEG XL architecture is traditional block-transform coding with upgrades to each component.

Submitted 13 August, 2019; v1 submitted 12 August, 2019; originally announced August 2019.

Comments: Royalty-free, open-source reference implementation in Q4 2019. v3 fixes PDF links and paper size

MSC Class: 94A08 ACM Class: I.4.2

arXiv:1906.00207 [pdf, other]

doi 10.1007/JHEP08(2019)057

SO(8) Supergravity and the Magic of Machine Learning

Authors: Iulia M. Comsa, Moritz Firsching, Thomas Fischbacher

Abstract: Using de Wit-Nicolai $D=4\;\mathcal{N}=8\;SO(8)$ supergravity as an example, we show how modern Machine Learning software libraries such as Google's TensorFlow can be employed to greatly simplify the analysis of high-dimensional scalar sectors of some M-Theory compactifications. We provide detailed information on the location, symmetries, and particle spectra and charges of 192 critical points on the scalar manifold of SO(8) supergravity, including one newly discovered $\mathcal{N}=1$ vacuum with $SO(3)$ residual symmetry, one new potentially stabilizable non-supersymmetric solution, and examples for "Galois conjugate pairs" of solutions, i.e. solution-pairs that share the same gauge group embedding into $SO(8)$ and minimal polynomials for the cosmological constant. Where feasible, we give analytic expressions for solution coordinates and cosmological constants. As the authors' aspiration is to present the discussion in a form that is accessible to both the Machine Learning and String Theory communities and allows adopting our methods towards the study of other models, we provide an introductory overview over the relevant Physics as well as Machine Learning concepts. This includes short pedagogical code examples. In particular, we show how to formulate a requirement for residual Supersymmetry as a Machine Learning loss function and effectively guide the numerical search towards supersymmetric critical points. Numerical investigations suggest that there are no further supersymmetric vacua beyond this newly discovered fifth solution.

Submitted 19 July, 2019; v1 submitted 1 June, 2019; originally announced June 2019.

Comments: 173 pages, 1 figure; v4 provides hyperlinkable individual PDF files for the Journal version (without appendix E) to refer to. Also fixes some typos and minor errors

MSC Class: 83E50

arXiv:1407.0683 [pdf, ps, other]

doi 10.1080/10586458.2014.956374

Computing maximal copies of polytopes contained in a polytope

Authors: Moritz Firsching

Abstract: Kepler (1619) and Croft (1980) have considered largest homothetic copies of one regular polytope contained in another regular polytope. For arbitrary pairs of polytopes we propose to model this as a quadratically constrained optimization problem. These problems can then be solved numerically; in case the optimal solutions are algebraic, exact optima can be recovered by solving systems of equations to very high precision and then using integer relation algorithms. Based on this approach, we complete Croft's solution to the problem concerning maximal inclusions of regular three-dimensional polyhedra by describing inclusions for the six remaining cases.

Submitted 2 July, 2014; originally announced July 2014.

Comments: 13 pages, 7 figures

MSC Class: 52C17; 51M20; 90C30

Journal ref: Experimental Mathematics Vol. 24 (2015), Issue 1, pp.98-105

Showing 1–5 of 5 results for author: Firsching, M