Showing 1–13 of 13 results for author: Geiger, M

Searching in archive stat.
  1. arXiv:2306.00091

    stat.ML cs.LG hep-th

    A General Framework for Equivariant Neural Networks on Reductive Lie Groups

    Authors: Ilyes Batatia, Mario Geiger, Jose Munoz, Tess Smidt, Lior Silberman, Christoph Ortner

    Abstract: Reductive Lie Groups, such as the orthogonal groups, the Lorentz group, or the unitary groups, play essential roles across scientific fields as diverse as high energy physics, quantum mechanics, quantum chromodynamics, molecular dynamics, computer vision, and imaging. In this paper, we present a general Equivariant Neural Network architecture capable of respecting the symmetries of the finite-dime…

    Submitted 31 May, 2023; originally announced June 2023.

  2. arXiv:2106.02347

    physics.chem-ph stat.ML

    SE(3)-equivariant prediction of molecular wavefunctions and electronic densities

    Authors: Oliver T. Unke, Mihail Bogojeski, Michael Gastegger, Mario Geiger, Tess Smidt, Klaus-Robert Müller

    Abstract: Machine learning has enabled the prediction of quantum chemical properties with high accuracy and efficiency, allowing to bypass computationally costly ab initio calculations. Instead of training on a fixed set of properties, more recent approaches attempt to learn the electronic wavefunction (or density) as a central quantity of atomistic systems, from which all other observables can be derived.…

    Submitted 20 October, 2021; v1 submitted 4 June, 2021; originally announced June 2021.

  3. arXiv:2008.08461

    cs.LG physics.chem-ph physics.comp-ph stat.ML

    Relevance of Rotationally Equivariant Convolutions for Predicting Molecular Properties

    Authors: Benjamin Kurt Miller, Mario Geiger, Tess E. Smidt, Frank Noé

    Abstract: Equivariant neural networks (ENNs) are graph neural networks embedded in $\mathbb{R}^3$ and are well suited for predicting molecular properties. The ENN library e3nn has customizable convolutions, which can be designed to depend only on distances between points, or also on angular features, making them rotationally invariant, or equivariant, respectively. This paper studies the practical value of…

    Submitted 24 November, 2020; v1 submitted 19 August, 2020; originally announced August 2020.

    Comments: Machine Learning for Molecules Workshop at NeurIPS 2020, NeurIPS workshop on Interpretable Inductive Biases and Physically Structured Learning

  4. Geometric compression of invariant manifolds in neural nets

    Authors: Jonas Paccolat, Leonardo Petrini, Mario Geiger, Kevin Tyloo, Matthieu Wyart

    Abstract: We study how neural networks compress uninformative input space in models where data lie in $d$ dimensions, but whose label only vary within a linear manifold of dimension $d_\parallel < d$. We show that for a one-hidden layer network initialized with infinitesimal weights (i.e. in the feature learning regime) trained with gradient descent, the first layer of weights evolve to become nearly insens…

    Submitted 11 March, 2021; v1 submitted 22 July, 2020; originally announced July 2020.

    Journal ref: Journal of Statistical Mechanics: Theory and Experiment, Volume 2021, April 2021

  5. Disentangling feature and lazy training in deep neural networks

    Authors: Mario Geiger, Stefano Spigler, Arthur Jacot, Matthieu Wyart

    Abstract: Two distinct limits for deep learning have been derived as the network width $h\rightarrow \infty$, depending on how the weights of the last layer scale with $h$. In the Neural Tangent Kernel (NTK) limit, the dynamics becomes linear in the weights and is described by a frozen kernel $Θ$. By contrast, in the Mean-Field limit, the dynamics can be expressed in terms of the distribution of the paramet…

    Submitted 4 October, 2020; v1 submitted 19 June, 2019; originally announced June 2019.

    Comments: minor revisions

  6. arXiv:1905.10843

    stat.ML cond-mat.dis-nn cs.LG

    Asymptotic learning curves of kernel methods: empirical data v.s. Teacher-Student paradigm

    Authors: Stefano Spigler, Mario Geiger, Matthieu Wyart

    Abstract: How many training data are needed to learn a supervised task? It is often observed that the generalization error decreases as $n^{-β}$ where $n$ is the number of training examples and $β$ an exponent that depends on both data and algorithm. In this work we measure $β$ when applying kernel methods to real datasets. For MNIST we find $β\approx 0.4$ and for CIFAR10 $β\approx 0.1$, for both regression…

    Submitted 18 August, 2020; v1 submitted 26 May, 2019; originally announced May 2019.

    Comments: We added (i) the prediction of the exponent $β$ for real data using kernel PCA; (ii) the generalization of our results to non-Gaussian data from reference [11] (Bordelon et al., "Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks")

  7. arXiv:1811.02017

    cs.LG cs.AI cs.CG cs.CV stat.ML

    A General Theory of Equivariant CNNs on Homogeneous Spaces

    Authors: Taco Cohen, Mario Geiger, Maurice Weiler

    Abstract: We present a general theory of Group equivariant Convolutional Neural Networks (G-CNNs) on homogeneous spaces such as Euclidean space and the sphere. Feature maps in these networks represent fields on a homogeneous base space, and layers are equivariant maps between spaces of fields. The theory enables a systematic classification of all existing G-CNNs in terms of their symmetry group, base space,…

    Submitted 9 January, 2020; v1 submitted 5 November, 2018; originally announced November 2018.

    Journal ref: Advances in Neural Information Processing Systems 32 (NeurIPS 2019) 9142-9153

  8. arXiv:1810.09665

    cs.LG cond-mat.dis-nn stat.ML

    A jamming transition from under- to over-parametrization affects loss landscape and generalization

    Authors: Stefano Spigler, Mario Geiger, Stéphane d'Ascoli, Levent Sagun, Giulio Biroli, Matthieu Wyart

    Abstract: We argue that in fully-connected networks a phase transition delimits the over- and under-parametrized regimes where fitting can or cannot be achieved. Under some general conditions, we show that this transition is sharp for the hinge loss. In the whole over-parametrized regime, poor minima of the loss are not encountered during training since the number of constraints to satisfy is too small to h…

    Submitted 18 June, 2019; v1 submitted 22 October, 2018; originally announced October 2018.

    Comments: arXiv admin note: text overlap with arXiv:1809.09349

  9. arXiv:1807.04950

    cs.LG cs.AI cs.CV stat.ML

    Deep Learning in the Wild

    Authors: Thilo Stadelmann, Mohammadreza Amirian, Ismail Arabaci, Marek Arnold, Gilbert François Duivesteijn, Ismail Elezi, Melanie Geiger, Stefan Lörwald, Benjamin Bruno Meier, Katharina Rombach, Lukas Tuggener

    Abstract: Deep learning with neural networks is applied by an increasing number of people outside of classic research environments, due to the vast success of the methodology on a wide range of machine perception tasks. While this interest is fueled by beautiful success stories, practical work in deep learning on novel tasks without existing baselines remains challenging. This paper explores the specific ch…

    Submitted 13 July, 2018; originally announced July 2018.

    Comments: Invited paper on ANNPR 2018

  10. arXiv:1807.02547

    cs.LG stat.ML

    3D Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data

    Authors: Maurice Weiler, Mario Geiger, Max Welling, Wouter Boomsma, Taco Cohen

    Abstract: We present a convolutional network that is equivariant to rigid body motions. The model uses scalar-, vector-, and tensor fields over 3D Euclidean space to represent data, and equivariant convolutions to map between such representations. These SE(3)-equivariant convolutions utilize kernels which are parameterized as a linear combination of a complete steerable kernel basis, which is derived analyt…

    Submitted 27 October, 2018; v1 submitted 6 July, 2018; originally announced July 2018.

  11. arXiv:1803.10743

    cs.LG cs.CV stat.ML

    Intertwiners between Induced Representations (with Applications to the Theory of Equivariant Neural Networks)

    Authors: Taco S. Cohen, Mario Geiger, Maurice Weiler

    Abstract: Group equivariant and steerable convolutional neural networks (regular and steerable G-CNNs) have recently emerged as a very effective model class for learning from signal data such as 2D and 3D images, video, and other data where symmetries are present. In geometrical terms, regular G-CNNs represent data in terms of scalar fields ("feature channels"), whereas the steerable G-CNN can also use vect…

    Submitted 30 March, 2018; v1 submitted 28 March, 2018; originally announced March 2018.

  12. arXiv:1803.06969

    stat.ML cond-mat.dis-nn cs.LG

    Comparing Dynamics: Deep Neural Networks versus Glassy Systems

    Authors: M. Baity-Jesi, L. Sagun, M. Geiger, S. Spigler, G. Ben Arous, C. Cammarota, Y. LeCun, M. Wyart, G. Biroli

    Abstract: We analyze numerically the training dynamics of deep neural networks (DNN) by using methods developed in statistical physics of glassy systems. The two main issues we address are (1) the complexity of the loss landscape and of the dynamics within it, and (2) to what extent DNNs share similarities with glassy systems. Our findings, obtained for different architectures and datasets, suggest that dur…

    Submitted 7 June, 2018; v1 submitted 19 March, 2018; originally announced March 2018.

    Comments: 10 pages, 5 figures. Version accepted at ICML 2018

    Journal ref: PMLR 80:324-333, 2018; Republication with DOI (cite this one): J. Stat. Mech. (2019) 124013

  13. arXiv:1801.10130

    cs.LG stat.ML

    Spherical CNNs

    Authors: Taco S. Cohen, Mario Geiger, Jonas Koehler, Max Welling

    Abstract: Convolutional Neural Networks (CNNs) have become the method of choice for learning problems involving 2D planar images. However, a number of problems of recent interest have created a demand for models that can analyze spherical images. Examples include omnidirectional vision for drones, robots, and autonomous cars, molecular regression problems, and global weather and climate modelling. A naive a…

    Submitted 25 February, 2018; v1 submitted 30 January, 2018; originally announced January 2018.

    Comments: Proceedings of the 6th International Conference on Learning Representations (ICLR), 2018

    Journal ref: Proceedings of the International Conference on Learning Representations, 2018