-
Equivariant Neural Tangent Kernels
Authors:
Philipp Misof,
Pan Kessel,
Jan E. Gerken
Abstract:
Equivariant neural networks have in recent years become an important technique for guiding architecture selection for neural networks with many applications in domains ranging from medical image analysis to quantum chemistry. In particular, as the most general linear equivariant layers with respect to the regular representation, group convolutions have been highly impactful in numerous applications. Although equivariant architectures have been studied extensively, much less is known about the training dynamics of equivariant neural networks. Concurrently, neural tangent kernels (NTKs) have emerged as a powerful tool to analytically understand the training dynamics of wide neural networks. In this work, we combine these two fields for the first time by giving explicit expressions for NTKs of group convolutional neural networks. In numerical experiments, we demonstrate superior performance for equivariant NTKs over non-equivariant NTKs on a classification task for medical images.
Submitted 10 June, 2024;
originally announced June 2024.
-
Emergent Equivariance in Deep Ensembles
Authors:
Jan E. Gerken,
Pan Kessel
Abstract:
We show that deep ensembles become equivariant for all inputs and at all training times by simply using data augmentation. Crucially, equivariance holds off-manifold and for any architecture in the infinite width limit. The equivariance is emergent in the sense that predictions of individual ensemble members are not equivariant but their collective prediction is. Neural tangent kernel theory is used to derive this result and we verify our theoretical insights using detailed numerical experiments.
Submitted 15 June, 2024; v1 submitted 5 March, 2024;
originally announced March 2024.
-
HEAL-SWIN: A Vision Transformer On The Sphere
Authors:
Oscar Carlsson,
Jan E. Gerken,
Hampus Linander,
Heiner Spieß,
Fredrik Ohlsson,
Christoffer Petersson,
Daniel Persson
Abstract:
High-resolution wide-angle fisheye images are becoming more and more important for robotics applications such as autonomous driving. However, using ordinary convolutional neural networks or vision transformers on this data is problematic due to projection and distortion losses introduced when projecting to a rectangular grid on the plane. We introduce the HEAL-SWIN transformer, which combines the highly uniform Hierarchical Equal Area iso-Latitude Pixelation (HEALPix) grid used in astrophysics and cosmology with the Hierarchical Shifted-Window (SWIN) transformer to yield an efficient and flexible model capable of training on high-resolution, distortion-free spherical data. In HEAL-SWIN, the nested structure of the HEALPix grid is used to perform the patching and windowing operations of the SWIN transformer, enabling the network to process spherical representations with minimal computational overhead. We demonstrate the superior performance of our model on both synthetic and real automotive datasets, as well as a selection of other image datasets, for semantic segmentation, depth regression and classification tasks. Our code is publicly available at https://github.com/JanEGerken/HEAL-SWIN.
Submitted 8 May, 2024; v1 submitted 14 July, 2023;
originally announced July 2023.
-
Diffeomorphic Counterfactuals with Generative Models
Authors:
Ann-Kathrin Dombrowski,
Jan E. Gerken,
Klaus-Robert Müller,
Pan Kessel
Abstract:
Counterfactuals can explain classification decisions of neural networks in a human interpretable way. We propose a simple but effective method to generate such counterfactuals. More specifically, we perform a suitable diffeomorphic coordinate transformation and then perform gradient ascent in these coordinates to find counterfactuals which are classified with great confidence as a specified target class. We propose two methods to leverage generative models to construct such suitable coordinate systems that are either exactly or approximately diffeomorphic. We analyze the generation process theoretically using Riemannian differential geometry and validate the quality of the generated counterfactuals using various qualitative and quantitative measures.
Submitted 16 June, 2022; v1 submitted 10 June, 2022;
originally announced June 2022.
-
Equivariance versus Augmentation for Spherical Images
Authors:
Jan E. Gerken,
Oscar Carlsson,
Hampus Linander,
Fredrik Ohlsson,
Christoffer Petersson,
Daniel Persson
Abstract:
We analyze the role of rotational equivariance in convolutional neural networks (CNNs) applied to spherical images. We compare the performance of the group equivariant networks known as S2CNNs and standard non-equivariant CNNs trained with an increasing amount of data augmentation. The chosen architectures can be considered baseline references for the respective design paradigms. Our models are trained and evaluated on single or multiple items from the MNIST or FashionMNIST dataset projected onto the sphere. For the task of image classification, which is inherently rotationally invariant, we find that by considerably increasing the amount of data augmentation and the size of the networks, it is possible for the standard CNNs to reach at least the same performance as the equivariant network. In contrast, for the inherently equivariant task of semantic segmentation, the non-equivariant networks are consistently outperformed by the equivariant networks with significantly fewer parameters. We also analyze and compare the inference latency and training times of the different networks, enabling detailed tradeoff considerations between equivariant architectures and data augmentation for practical problems. The equivariant spherical networks used in the experiments are available at https://github.com/JanEGerken/sem_seg_s2cnn .
Submitted 12 July, 2022; v1 submitted 8 February, 2022;
originally announced February 2022.
-
Geometric Deep Learning and Equivariant Neural Networks
Authors:
Jan E. Gerken,
Jimmy Aronsson,
Oscar Carlsson,
Hampus Linander,
Fredrik Ohlsson,
Christoffer Petersson,
Daniel Persson
Abstract:
We survey the mathematical foundations of geometric deep learning, focusing on group equivariant and gauge equivariant neural networks. We develop gauge equivariant convolutional neural networks on arbitrary manifolds $\mathcal{M}$ using principal bundles with structure group $K$ and equivariant maps between sections of associated vector bundles. We also discuss group equivariant neural networks for homogeneous spaces $\mathcal{M}=G/K$, which are instead equivariant with respect to the global symmetry $G$ on $\mathcal{M}$. Group equivariant layers can be interpreted as intertwiners between induced representations of $G$, and we show their relation to gauge equivariant convolutional layers. We analyze several applications of this formalism, including semantic segmentation and object detection networks. We also discuss the case of spherical networks in great detail, corresponding to the case $\mathcal{M}=S^2=\mathrm{SO}(3)/\mathrm{SO}(2)$. Here we emphasize the use of Fourier analysis involving Wigner matrices, spherical harmonics and Clebsch-Gordan coefficients for $G=\mathrm{SO}(3)$, illustrating the power of representation theory for deep learning.
Submitted 28 May, 2021;
originally announced May 2021.
-
Modular Graph Forms and Scattering Amplitudes in String Theory
Authors:
Jan E. Gerken
Abstract:
In this thesis, we investigate the low-energy expansion of scattering amplitudes of closed strings at one-loop level (i.e. at genus one) in a ten-dimensional Minkowski background using a special class of functions called modular graph forms. These allow for a systematic evaluation of the low-energy expansion and satisfy many non-trivial algebraic and differential relations. We study these relations in detail, leading to basis decompositions for a large number of modular graph forms which greatly reduce the complexity of the expansions of the integrals appearing in the amplitude. One of the results of this thesis is a Mathematica package which automatizes these simplifications. We use these techniques to compute the leading low-energy orders of the scattering amplitude of four gluons in the heterotic string at one-loop level.
Furthermore, we study a generating function which conjecturally contains the torus integrals of all perturbative closed-string theories. We write this generating function in terms of iterated integrals of holomorphic Eisenstein series and use this approach to arrive at a more rigorous characterization of the space of modular graph forms than was possible before.
For tree-level string amplitudes, the single-valued map of multiple zeta values maps open-string amplitudes to closed-string amplitudes. The definition of a suitable one-loop generalization, a so-called elliptic single-valued map, is an active area of research and we provide a new perspective on this topic using our generating function of torus integrals.
The original version of this thesis, as submitted in June 2020 to the Humboldt University Berlin, is available under the DOI 10.18452/21829. The present text contains minor updates compared to this version, reflecting further developments in the literature, in particular concerning the construction of an elliptic single-valued map.
Submitted 17 November, 2020;
originally announced November 2020.
-
Towards closed strings as single-valued open strings at genus one
Authors:
Jan E. Gerken,
Axel Kleinschmidt,
Carlos R. Mafra,
Oliver Schlotterer,
Bram Verbeek
Abstract:
We relate the low-energy expansions of world-sheet integrals in genus-one amplitudes of open- and closed-string states. The respective expansion coefficients are elliptic multiple zeta values in the open-string case and non-holomorphic modular forms dubbed "modular graph forms" for closed strings. By inspecting the differential equations and degeneration limits of suitable generating series of genus-one integrals, we identify formal substitution rules mapping the elliptic multiple zeta values of open strings to the modular graph forms of closed strings. Based on the properties of these rules, we refer to them as an elliptic single-valued map which generalizes the genus-zero notion of a single-valued map acting on multiple zeta values seen in tree-level relations between the open and closed string.
Submitted 14 June, 2021; v1 submitted 20 October, 2020;
originally announced October 2020.
-
Basis Decompositions and a Mathematica Package for Modular Graph Forms
Authors:
Jan E. Gerken
Abstract:
Modular graph forms (MGFs) are a class of non-holomorphic modular forms which naturally appear in the low-energy expansion of closed-string genus-one amplitudes and have generated considerable interest from pure mathematicians. MGFs satisfy numerous non-trivial algebraic and differential relations which have been studied extensively in the literature and lead to significant simplifications. In this paper, we systematically combine these relations to obtain basis decompositions of all two- and three-point MGFs of total modular weight $w+\bar{w}\leq12$, starting from just two well-known identities for banana graphs. Furthermore, we study previously known relations in the integral representation of MGFs, leading to a new understanding of holomorphic subgraph reduction as Fay identities of Kronecker--Eisenstein series and opening the door towards decomposing divergent graphs. We provide a computer implementation for the manipulation of MGFs in the form of the $\texttt{Mathematica}$ package $\texttt{ModularGraphForms}$ which includes the basis decompositions obtained.
Submitted 13 July, 2020; v1 submitted 10 July, 2020;
originally announced July 2020.
-
Generating series of all modular graph forms from iterated Eisenstein integrals
Authors:
Jan E. Gerken,
Axel Kleinschmidt,
Oliver Schlotterer
Abstract:
We study generating series of torus integrals that contain all so-called modular graph forms relevant for massless one-loop closed-string amplitudes. By analysing the differential equation of the generating series we construct a solution for its low-energy expansion to all orders in the inverse string tension $\alpha'$. Our solution is expressed through initial data involving multiple zeta values and certain real-analytic functions of the modular parameter of the torus. These functions are built from real and imaginary parts of holomorphic iterated Eisenstein integrals and should be closely related to Brown's recent construction of real-analytic modular forms. We study the properties of our real-analytic objects in detail and give explicit examples to a fixed order in the $\alpha'$-expansion. In particular, our solution allows for a counting of linearly independent modular graph forms at a given weight, confirming previous partial results and giving predictions for higher, hitherto unexplored weights. It also sheds new light on the topic of uniform transcendentality of the $\alpha'$-expansion.
Submitted 13 May, 2020; v1 submitted 10 April, 2020;
originally announced April 2020.
-
All-order differential equations for one-loop closed-string integrals and modular graph forms
Authors:
Jan E. Gerken,
Axel Kleinschmidt,
Oliver Schlotterer
Abstract:
We investigate generating functions for the integrals over world-sheet tori appearing in closed-string one-loop amplitudes of bosonic, heterotic and type-II theories. These closed-string integrals are shown to obey homogeneous and linear differential equations in the modular parameter of the torus. We spell out the first-order Cauchy-Riemann and second-order Laplace equations for the generating functions for any number of external states. The low-energy expansion of such torus integrals introduces infinite families of non-holomorphic modular forms known as modular graph forms. Our results generate homogeneous first- and second-order differential equations for arbitrary such modular graph forms and can be viewed as a step towards all-order low-energy expansions of closed-string integrals.
Submitted 21 January, 2020; v1 submitted 8 November, 2019;
originally announced November 2019.
-
Heterotic-string amplitudes at one loop: modular graph forms and relations to open strings
Authors:
Jan E. Gerken,
Axel Kleinschmidt,
Oliver Schlotterer
Abstract:
We investigate one-loop four-point scattering of non-abelian gauge bosons in heterotic string theory and identify new connections with the corresponding open-string amplitude. In the low-energy expansion of the heterotic-string amplitude, the integrals over torus punctures are systematically evaluated in terms of modular graph forms, certain non-holomorphic modular forms. For a specific torus integral, the modular graph forms in the low-energy expansion are related to the elliptic multiple zeta values from the analogous open-string integrations over cylinder boundaries. The detailed correspondence between these modular graph forms and elliptic multiple zeta values supports a recent proposal for an elliptic generalization of the single-valued map at genus zero.
Submitted 6 March, 2019; v1 submitted 6 November, 2018;
originally announced November 2018.
-
Holomorphic subgraph reduction of higher-point modular graph forms
Authors:
Jan E. Gerken,
Justin Kaidi
Abstract:
Modular graph forms are a class of modular covariant functions which appear in the genus-one contribution to the low-energy expansion of closed string scattering amplitudes. Modular graph forms with holomorphic subgraphs enjoy the simplifying property that they may be reduced to sums of products of modular graph forms of strictly lower loop order. In the particular case of dihedral modular graph forms, a closed form expression for this holomorphic subgraph reduction was obtained previously by D'Hoker and Green. In the current work, we extend these results to trihedral modular graph forms. Doing so involves the identification of a modular covariant regularization scheme for certain conditionally convergent sums over discrete momenta, with some elements of the sum being excluded. The appropriate regularization scheme is identified for any number of exclusions, which in principle allows one to perform holomorphic subgraph reduction of higher-point modular graph forms with arbitrary holomorphic subgraphs.
Submitted 16 February, 2019; v1 submitted 13 September, 2018;
originally announced September 2018.