Search | arXiv e-print repository

On backpropagating Hessians through ODEs

Authors: Axel Ciceri, Thomas Fischbacher

Abstract: We discuss the problem of numerically backpropagating Hessians through ordinary differential equations (ODEs) in various contexts and elucidate how different approaches may be favourable in specific situations. We discuss both theoretical and pragmatic aspects such as, respectively, bounds on computational effort and typical impact of framework overhead. Focusing on the approach of hand-implemented ODE-backpropagation, we develop the computation for the Hessian of orbit-nonclosure for a mechanical system. We also clarify the mathematical framework for extending the backward-ODE-evolution of the costate-equation to Hessians, in its most generic form. Some calculations, such as that of the Hessian for orbit non-closure, are performed in a language, defined in terms of a formal grammar, that we introduce to facilitate the tracking of intermediate quantities. As pedagogical examples, we discuss the Hessian of orbit-nonclosure for the higher dimensional harmonic oscillator and conceptually related problems in Newtonian gravitational theory. In particular, applying our approach to the figure-8 three-body orbit, we readily rediscover a distorted-figure-8 solution originally described by Simó. Possible applications may include: improvements to training of 'neural ODE'-type deep learning with second-order methods, numerical analysis of quantum corrections around classical paths, and, more broadly, studying options for adjusting an ODE's initial configuration such that the impact on some given objective function is small.

Submitted 19 January, 2023; originally announced January 2023.

Comments: 32 pages, 3 figures, 1500 lines of code in ancillary files (including tests)

MSC Class: 93C73 ACM Class: G.1.6; G.1.7

arXiv:2008.05859 [pdf, other]

Single-Photon Image Classification

Authors: Thomas Fischbacher, Luciano Sbaiz

Abstract: Quantum computing-based machine learning mainly focuses on quantum computing hardware that is experimentally challenging to realize due to requiring quantum gates that operate at very low temperature. Instead, we demonstrate the existence of a lower performance and much lower effort island on the accuracy-vs-qubits graph that may well be experimentally accessible with room temperature optics. This high temperature "quantum computing toy model" is nevertheless interesting to study as it allows rather accessible explanations of key concepts in quantum computing, in particular interference, entanglement, and the measurement process. We specifically study the problem of classifying an example from the MNIST and Fashion-MNIST datasets, subject to the constraint that we have to make a prediction after the detection of the very first photon that passed a coherently illuminated filter showing the example. Whereas a classical set-up in which a photon is detected after falling on one of the $28\times 28$ image pixels is limited to a (maximum likelihood estimation) accuracy of $21.27\%$ for MNIST, respectively $18.27\%$ for Fashion-MNIST, we show that the theoretically achievable accuracy when exploiting inference by optically transforming the quantum state of the photon is at least $41.27\%$ for MNIST, respectively $36.14\%$ for Fashion-MNIST. We show in detail how to train the corresponding transformation with TensorFlow and also explain how this example can serve as a teaching tool for the measurement process in quantum mechanics.

Submitted 12 March, 2021; v1 submitted 13 August, 2020; originally announced August 2020.

Comments: See ancillary files for training code and pre-trained models

arXiv:2008.03936 [pdf, other]

Intelligent Matrix Exponentiation

Authors: Thomas Fischbacher, Iulia M. Comsa, Krzysztof Potempa, Moritz Firsching, Luca Versari, Jyrki Alakuijala

Abstract: We present a novel machine learning architecture that uses the exponential of a single input-dependent matrix as its only nonlinearity. The mathematical simplicity of this architecture allows a detailed analysis of its behaviour, providing robustness guarantees via Lipschitz bounds. Despite its simplicity, a single matrix exponential layer already provides universal approximation properties and can learn fundamental functions of the input, such as periodic functions or multivariate polynomials. This architecture outperforms other general-purpose architectures on benchmark problems, including CIFAR-10, using substantially fewer parameters.

Submitted 10 August, 2020; originally announced August 2020.

Comments: 20 pages, 10 figures

arXiv:1908.03565 [pdf]

Committee Draft of JPEG XL Image Coding System

Authors: Alexander Rhatushnyak, Jan Wassenberg, Jon Sneyers, Jyrki Alakuijala, Lode Vandevenne, Luca Versari, Robert Obryk, Zoltan Szabadka, Evgenii Kliuchnikov, Iulia-Maria Comsa, Krzysztof Potempa, Martin Bruse, Moritz Firsching, Renata Khasanova, Ruud van Asseldonk, Sami Boukortt, Sebastian Gomez, Thomas Fischbacher

Abstract: JPEG XL is a practical approach focused on scalable web distribution and efficient compression of high-quality images. It provides various benefits compared to existing image formats: 60% size reduction at equivalent subjective quality; fast, parallelizable decoding and encoding configurations; features such as progressive, lossless, animation, and reversible transcoding of existing JPEG with 22% size reduction; support for high-quality applications including wide gamut, higher resolution/bit depth/dynamic range, and visually lossless coding. The JPEG XL architecture is traditional block-transform coding with upgrades to each component.

Submitted 13 August, 2019; v1 submitted 12 August, 2019; originally announced August 2019.

Comments: Royalty-free, open-source reference implementation in Q4 2019. v3 fixes PDF links and paper size

MSC Class: 94A08 ACM Class: I.4.2

arXiv:1907.13223 [pdf, other]

Temporal Coding in Spiking Neural Networks with Alpha Synaptic Function: Learning with Backpropagation

Authors: Iulia M. Comsa, Krzysztof Potempa, Luca Versari, Thomas Fischbacher, Andrea Gesmundo, Jyrki Alakuijala

Abstract: The timing of individual neuronal spikes is essential for biological brains to make fast responses to sensory stimuli. However, conventional artificial neural networks lack the intrinsic temporal coding ability present in biological networks. We propose a spiking neural network model that encodes information in the relative timing of individual neuron spikes. In classification tasks, the output of the network is indicated by the first neuron to spike in the output layer. This temporal coding scheme allows the supervised training of the network with backpropagation, using locally exact derivatives of the postsynaptic spike times with respect to presynaptic spike times. The network operates using a biologically-plausible alpha synaptic transfer function. Additionally, we use trainable synchronisation pulses that provide bias, add flexibility during training and exploit the decay part of the alpha function. We show that such networks can be trained successfully on noisy Boolean logic tasks and on the MNIST dataset encoded in time. The results show that the spiking neural network outperforms comparable spiking models on MNIST and achieves similar quality to fully connected conventional networks with the same architecture.
We also find that the spiking network spontaneously discovers two operating regimes, mirroring the accuracy-speed trade-off observed in human decision-making: a slow regime, where a decision is taken after all hidden neurons have spiked and the accuracy is very high, and a fast regime, where a decision is taken very fast but the accuracy is lower. These results demonstrate the computational power of spiking networks with biological characteristics that encode information in the timing of individual neurons. By studying temporal coding in spiking networks, we aim to create building blocks towards energy-efficient and more complex biologically-inspired neural architectures.

Submitted 16 November, 2020; v1 submitted 30 July, 2019; originally announced July 2019.

Comments: Open-source code related to this paper is available at https://github.com/google/ihmehimmeli v2: Added references and added some clarifications for the methods

arXiv:1906.00207 [pdf, other]

doi 10.1007/JHEP08(2019)057

SO(8) Supergravity and the Magic of Machine Learning

Authors: Iulia M. Comsa, Moritz Firsching, Thomas Fischbacher

Abstract: Using de Wit-Nicolai $D=4\;\mathcal{N}=8\;SO(8)$ supergravity as an example, we show how modern Machine Learning software libraries such as Google's TensorFlow can be employed to greatly simplify the analysis of high-dimensional scalar sectors of some M-Theory compactifications. We provide detailed information on the location, symmetries, and particle spectra and charges of 192 critical points on the scalar manifold of SO(8) supergravity, including one newly discovered $\mathcal{N}=1$ vacuum with $SO(3)$ residual symmetry, one new potentially stabilizable non-supersymmetric solution, and examples for "Galois conjugate pairs" of solutions, i.e. solution-pairs that share the same gauge group embedding into $SO(8)$ and minimal polynomials for the cosmological constant. Where feasible, we give analytic expressions for solution coordinates and cosmological constants. As the authors' aspiration is to present the discussion in a form that is accessible to both the Machine Learning and String Theory communities and allows adopting our methods towards the study of other models, we provide an introductory overview over the relevant Physics as well as Machine Learning concepts. This includes short pedagogical code examples. In particular, we show how to formulate a requirement for residual Supersymmetry as a Machine Learning loss function and effectively guide the numerical search towards supersymmetric critical points. Numerical investigations suggest that there are no further supersymmetric vacua beyond this newly discovered fifth solution.

Submitted 19 July, 2019; v1 submitted 1 June, 2019; originally announced June 2019.

Comments: 173 pages, 1 figure; v4 provides hyperlinkable individual PDF files for the Journal version (without appendix E) to refer to. Also fixes some typos and minor errors

MSC Class: 83E50

arXiv:0907.1587 [pdf, ps, other]

Continuum multi-physics modeling with scripting languages: the Nsim simulation compiler prototype for classical field theory

Authors: Thomas Fischbacher, Hans Fangohr

Abstract: We demonstrate that for a broad class of physical systems that can be described using classical field theory, automated runtime translation of the physical equations to parallelized finite-element numerical simulation code is feasible. This allows the implementation of multiphysics extension modules to popular scripting languages (such as Python) that handle the complete specification of the physical system at script level. We discuss two example applications that utilize this framework: the micromagnetic simulation package "Nmag" as well as a short Python script to study morphogenesis in a reaction-diffusion model.

Submitted 9 July, 2009; originally announced July 2009.

Comments: 50 pages, 5 figures

arXiv:cs/0406002 [pdf, ps, other]

A novel approach to symbolic algebra

Authors: Thomas Fischbacher

Abstract: A prototype for an extensible interactive graphical term manipulation system is presented that combines pattern matching and nondeterministic evaluation to provide a convenient framework for doing tedious algebraic manipulations that so far had to be done manually in a semi-automatic fashion.

Submitted 2 June, 2004; originally announced June 2004.

Comments: 15 pages

Report number: AEI-2004-043 ACM Class: G.4; I.1.3

arXiv:hep-th/0305176 [pdf, ps, other]

Mapping the vacuum structure of gauged maximal supergravities: an application of high-performance symbolic algebra

Authors: Thomas Fischbacher

Abstract: The analysis of the extremal structure of the scalar potentials of gauged maximally extended supergravity models in five, four, and three dimensions, and hence the determination of possible vacuum states of these models is a computationally challenging task due to the occurrence of the exceptional Lie groups $E_6$, $E_7$, $E_8$ in the definition of these potentials. At present, the most promising approach to gain information about nontrivial vacua of these models is to perform a truncation of the potential to submanifolds of the $G/H$ coset manifold of scalars which are invariant under a subgroup of the gauge group and of sufficiently low dimension to make an analytic treatment possible. New tools are presented which allow a systematic and highly effective study of these potentials up to a previously unreached level of complexity. Explicit forms of new truncations of the potentials of four- and three-dimensional models are given, and for N=16, D=3 supergravities, which are much more rich in structure than their higher-dimensional cousins, a series of new nontrivial vacua is identified and analysed.

Submitted 20 May, 2003; originally announced May 2003.

Comments: PhD thesis, 140 pages, 11 figures

Report number: AEI-2003-046

arXiv:hep-th/0208218 [pdf, ps, other]

Introducing LambdaTensor1.0 - A package for explicit symbolic and numeric Lie algebra and Lie group calculations

Authors: Thomas Fischbacher

Abstract: Due to the occurrence of large exceptional Lie groups in supergravity, calculations involving explicit Lie algebra and Lie group element manipulations easily become very complicated and hence also error-prone if done by hand. Research on the extremal structure of maximal gauged supergravity theories in various dimensions sparked the development of a library for efficient abstract multilinear algebra calculations involving sparse and non-sparse higher-rank tensors, which is presented here.

Submitted 26 March, 2003; v1 submitted 29 August, 2002; originally announced August 2002.

Comments: 10 pages; the package's homepage is http://www.cip.physik.uni-muenchen.de/~tf/lambdatensor/; to be published in "Forschung und wissenschaftliches Rechnen - Beitraege zum Heinz-Billing-Preis 2002"; replacement reflects the corresponding release of version 1.1, which is described briefly in an addendum

Report number: AEI-2002-065

Showing 1–10 of 10 results for author: Fischbacher, T