Phonon predictions with E(3)-equivariant graph neural networks

Authors: Shiang Fang, Mario Geiger, Joseph G. Checkelsky, Tess Smidt

Abstract: We present an equivariant neural network for predicting vibrational and phonon modes of molecules and periodic crystals, respectively. These predictions are made by evaluating the second derivative Hessian matrices of the learned energy model that is trained with the energy and force data. Using this method, we are able to efficiently predict phonon dispersion and the density of states for inorganic crystal materials. For molecules, we also derive the symmetry constraints for IR/Raman active modes by analyzing the phonon mode irreducible representations. Additionally, we demonstrate that using Hessian as a new type of higher-order training data improves energy models beyond models that only use lower-order energy and force data. With this second derivative approach, one can directly relate the energy models to the experimental observations for the vibrational properties. This approach further connects to a broader class of physical observables with a generalized energy model that includes external fields.

Submitted 17 March, 2024; originally announced March 2024.

Comments: 4 figures

arXiv:2301.13703 [pdf, other]

Dissecting the Effects of SGD Noise in Distinct Regimes of Deep Learning

Authors: Antonio Sclocchi, Mario Geiger, Matthieu Wyart

Abstract: Understanding when the noise in stochastic gradient descent (SGD) affects generalization of deep neural networks remains a challenge, complicated by the fact that networks can operate in distinct training regimes. Here we study how the magnitude of this noise $T$ affects performance as the size of the training set $P$ and the scale of initialization $α$ are varied. For gradient descent, $α$ is a key parameter that controls if the network is `lazy' ($α\gg1$) or instead learns features ($α\ll1$). For classification of MNIST and CIFAR10 images, our central results are: (i) obtaining phase diagrams for performance in the $(α,T)$ plane. They show that SGD noise can be detrimental or instead useful depending on the training regime. Moreover, although increasing $T$ or decreasing $α$ both allow the net to escape the lazy regime, these changes can have opposite effects on performance. (ii) Most importantly, we find that the characteristic temperature $T_c$ where the noise of SGD starts affecting the trained model (and eventually performance) is a power law of $P$. We relate this finding with the observation that key dynamical quantities, such as the total variation of weights during training, depend on both $T$ and $P$ as power laws. These results indicate that a key effect of SGD noise occurs late in training by affecting the stopping process whereby all data are fitted. Indeed, we argue that due to SGD noise, nets must develop a stronger `signal', i.e. larger informative weights, to fit the data, leading to a longer training time. A stronger signal and a longer training time are also required when the size of the training set $P$ increases. We confirm these views in the perceptron model, where signal and noise can be precisely measured. Interestingly, exponents characterizing the effect of SGD depend on the density of data near the decision boundary, as we explain.

Submitted 30 May, 2023; v1 submitted 31 January, 2023; originally announced January 2023.

Comments: 25 pages, 21 figures, added analysis in feature-learning

arXiv:2201.03726 [pdf]

Cracking the Quantum Scaling Limit with Machine Learned Electron Densities

Authors: Joshua A. Rackers, Lucas Tecot, Mario Geiger, Tess E. Smidt

Abstract: A long-standing goal of science is to accurately solve the Schrödinger equation for large molecular systems. The poor scaling of current quantum chemistry algorithms on classical computers imposes an effective limit of about a few dozen atoms for which we can calculate molecular electronic structure. We present a machine learning (ML) method to break through this scaling limit and make quantum chemistry calculations of very large systems possible. We show that Euclidean Neural Networks can be trained to predict the electron density with high fidelity from limited data. Learning the electron density allows us to train a machine learning model on small systems and make accurate predictions on large ones. We show that this ML electron density model can break through the quantum scaling limit and calculate the electron density of systems of thousands of atoms with quantum accuracy.

Submitted 10 February, 2022; v1 submitted 10 January, 2022; originally announced January 2022.

arXiv:2106.02893 [pdf]

Stern and Diffuse Layer Interactions During Ionic Strength Cycling

Authors: Emily Ma, Jeongmin Kim, HanByul Chang, Paul E. Ohno, Richard J. Jodts, Thomas F. Miller III, Franz M. Geiger

Abstract: Second harmonic generation amplitude and phase measurements are acquired in real time from fused silica:water interfaces that are subjected to ionic strength transitions conducted at pH 5.8. In conjunction with atomistic modeling, we identify correlations between structure in the Stern layer, encoded in the total second-order nonlinear susceptibility, chi(2)tot, and in the diffuse layer, encoded in the product of chi(2)tot and the total interfacial potential, phi(0)tot. chi(2)tot:phi(0)tot correlation plots indicate that the dynamics in the Stern and diffuse layers are decoupled from one another under some conditions (large change in ionic strength), while they change in lockstep under others (smaller change in ionic strength) as the ionic strength in the aqueous bulk solution varies. The quantitative structural and electrostatic information obtained also informs on the molecular origin of hysteresis in ionic strength cycling over fused silica. Atomistic simulations suggest a prominent role of contact ion pairs (as opposed to solvent-separated ion pairs) in the Stern layer. Those simulations also indicate that net water alignment is limited to the first 2 nm from the interface, even at 0 M ionic strength, highlighting water's polarization as an important contributor to nonlinear optical signal generation.

Submitted 5 June, 2021; originally announced June 2021.

Comments: Pre-edited version, 21 pages main text, 6 Figures, Supporting Information available upon request

arXiv:2104.02802 [pdf]

A New Imaginary Term in the 2nd Order Nonlinear Susceptibility from Charged Interfaces

Authors: Emily Ma, Paul E. Ohno, Jeongmin Kim, Yangdongling Dawning Liu, Emilie H. Lozier, Thomas F. Miller III, Hong-Fei Wang, Franz M. Geiger

Abstract: Non-resonant second harmonic generation phase and amplitude measurements obtained from the silica:water interface at varying pH and 0.5 M ionic strength point to the existence of a nonlinear susceptibility term, which we call chi(3)X, that is associated with a 90 deg phase shift. Including this contribution in a model for the total effective second-order nonlinear susceptibility produces reasonable point estimates for interfacial potentials and second-order nonlinear susceptibilities when chi(3)X is about 1.5 times chi(3)water. A model without this term and containing only traditional chi(2) and chi(3) terms cannot recapitulate the experimental data. The new model also provides a demonstrated utility for distinguishing apparent differences in the second-order nonlinear susceptibility when the electrolyte is NaCl vs MgSO4, pointing to the possibility of using HD-SHG to investigate ion-specificity in interfacial processes.

Submitted 6 April, 2021; originally announced April 2021.

Comments: Pre-edited version, 16 Pages main text, 4 Figures, Supporting Information

Journal ref: J. Phys. Chem. Letters (2021)

arXiv:2101.03164 [pdf, other]

doi 10.1038/s41467-022-29939-5

E(3)-Equivariant Graph Neural Networks for Data-Efficient and Accurate Interatomic Potentials

Authors: Simon Batzner, Albert Musaelian, Lixin Sun, Mario Geiger, Jonathan P. Mailoa, Mordechai Kornbluth, Nicola Molinari, Tess E. Smidt, Boris Kozinsky

Abstract: This work presents Neural Equivariant Interatomic Potentials (NequIP), an E(3)-equivariant neural network approach for learning interatomic potentials from ab-initio calculations for molecular dynamics simulations. While most contemporary symmetry-aware models use invariant convolutions and only act on scalars, NequIP employs E(3)-equivariant convolutions for interactions of geometric tensors, resulting in a more information-rich and faithful representation of atomic environments. The method achieves state-of-the-art accuracy on a challenging and diverse set of molecules and materials while exhibiting remarkable data efficiency. NequIP outperforms existing models with up to three orders of magnitude fewer training data, challenging the widely held belief that deep neural networks require massive training sets. The high data efficiency of the method allows for the construction of accurate potentials using high-order quantum chemical level of theory as reference and enables high-fidelity molecular dynamics simulations over long time scales.

Submitted 16 December, 2021; v1 submitted 8 January, 2021; originally announced January 2021.

arXiv:2009.01780 [pdf, other]

doi 10.1103/PhysRevLett.125.258301

Topological phase transition in coupled rock-paper-scissor cycles

Authors: Johannes Knebel, Philipp M. Geiger, Erwin Frey

Abstract: A hallmark of topological phases is the occurrence of topologically protected modes at the system's boundary. Here we find topological phases in the antisymmetric Lotka-Volterra equation (ALVE). The ALVE is a nonlinear dynamical system and describes, e.g., the evolutionary dynamics of a rock-paper-scissors cycle. On a one-dimensional chain of rock-paper-scissor cycles, topological phases become manifest as robust polarization states. At the transition point between left and right polarization, solitonic waves are observed. This topological phase transition lies in symmetry class $D$ within the "ten-fold way" classification as also realized by 1D topological superconductors.

Submitted 3 September, 2020; originally announced September 2020.

Journal ref: Phys. Rev. Lett. 125, 258301 (2020)

arXiv:2007.02005 [pdf, other]

doi 10.1103/PhysRevResearch.3.L012002

Finding Symmetry Breaking Order Parameters with Euclidean Neural Networks

Authors: Tess E. Smidt, Mario Geiger, Benjamin Kurt Miller

Abstract: Curie's principle states that "when effects show certain asymmetry, this asymmetry must be found in the causes that gave rise to them". We demonstrate that symmetry equivariant neural networks uphold Curie's principle and can be used to articulate many symmetry-relevant scientific questions into simple optimization problems. We prove these properties mathematically and demonstrate them numerically by training a Euclidean symmetry equivariant neural network to learn symmetry-breaking input to deform a square into a rectangle and to generate octahedra tilting patterns in perovskites.

Submitted 26 October, 2020; v1 submitted 4 July, 2020; originally announced July 2020.

Comments: 6 pages, 3 figures

Journal ref: Phys. Rev. Research 3, 012002 (2021)

arXiv:1907.13170 [pdf]

doi 10.1073/pnas.1906601116

Energy Conversion via Metal Nanolayers

Authors: Mavis D. Boamah, Emilie H. Lozier, Jeongmin Kim, Paul E. Ohno, Catherine E. Walker, Thomas F. Miller III, Franz M. Geiger

Abstract: Current approaches for electric power generation from nanoscale conducting or semi-conducting layers in contact with moving aqueous droplets are promising as they show efficiencies of around 30 percent, yet, even the most successful ones pose challenges regarding fabrication and scaling. Here, we report stable, all-inorganic single-element structures synthesized in a single step that generate electrical current when alternating salinity gradients flow along its surface in a liquid flow cell. 10 nm to 30 nm thin nanolayers of iron, vanadium, or nickel produce several tens of mV and several microA cm^-2 at aqueous flow velocities of just a few cm s^-1. The principle of operation is strongly sensitive to charge-carrier motion in the thermal oxide nano-overlayer that forms spontaneously in air and then self terminates. Indeed, experiments suggest a role for intra-oxide electron transfer for Fe, V, and Ni nanolayers, as their thermal oxides contain several metal oxidation states, whereas controls using Al or Cr nanolayers, which self-terminate with oxides that are redox inactive under the experimental conditions, exhibit dramatically diminished performance. The nanolayers are shown to generate electrical current in various modes of application with moving liquids, including sliding liquid droplets, salinity gradients in a flowing liquid, and in the oscillatory motion of a liquid without a salinity gradient.

Submitted 30 July, 2019; originally announced July 2019.

Comments: Pre-edited final version, 16 pages main text, 5 figures

Journal ref: PNAS 2019

arXiv:1905.10843 [pdf, other]

doi 10.1088/1742-5468/abc61d

Asymptotic learning curves of kernel methods: empirical data v.s. Teacher-Student paradigm

Authors: Stefano Spigler, Mario Geiger, Matthieu Wyart

Abstract: How many training data are needed to learn a supervised task? It is often observed that the generalization error decreases as $n^{-β}$ where $n$ is the number of training examples and $β$ an exponent that depends on both data and algorithm. In this work we measure $β$ when applying kernel methods to real datasets. For MNIST we find $β\approx 0.4$ and for CIFAR10 $β\approx 0.1$, for both regression and classification tasks, and for Gaussian or Laplace kernels. To rationalize the existence of non-trivial exponents that can be independent of the specific kernel used, we study the Teacher-Student framework for kernels. In this scheme, a Teacher generates data according to a Gaussian random field, and a Student learns them via kernel regression. With a simplifying assumption -- namely that the data are sampled from a regular lattice -- we derive analytically $β$ for translation invariant kernels, using previous results from the kriging literature. Provided that the Student is not too sensitive to high frequencies, $β$ depends only on the smoothness and dimension of the training data. We confirm numerically that these predictions hold when the training points are sampled at random on a hypersphere. Overall, the test error is found to be controlled by the magnitude of the projection of the true function on the kernel eigenvectors whose rank is larger than $n$. Using this idea we relate the exponent $β$ to an exponent $a$ describing how the coefficients of the true function in the eigenbasis of the kernel decay with rank. We extract $a$ from real data by performing kernel PCA, leading to $β\approx0.36$ for MNIST and $β\approx0.07$ for CIFAR10, in good agreement with observations. We argue that these rather large exponents are possible due to the small effective dimension of the data.

Submitted 18 August, 2020; v1 submitted 26 May, 2019; originally announced May 2019.

Comments: We added (i) the prediction of the exponent $β$ for real data using kernel PCA; (ii) the generalization of our results to non-Gaussian data from reference [11] (Bordelon et al., "Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks")

arXiv:1905.09912 [pdf, other]

doi 10.7554/eLife.51020

Stochastic Yield Catastrophes and Robustness in Self-Assembly

Authors: Florian M. Gartner, Isabella R. Graf, Patrick Wilke, Philipp M. Geiger, Erwin Frey

Abstract: A guiding principle in self-assembly is that, for high production yield, nucleation of structures must be significantly slower than their growth. However, details of the mechanism that impedes nucleation are broadly considered irrelevant. Here, we analyze self-assembly into finite-sized target structures employing mathematical modeling. We investigate two key scenarios to delay nucleation: (i) by introducing a slow activation step for the assembling constituents and, (ii) by decreasing the dimerization rate. These scenarios have widely different characteristics. While the dimerization scenario exhibits robust behavior, the activation scenario is highly sensitive to demographic fluctuations. These demographic fluctuations ultimately disfavor growth compared to nucleation and can suppress yield completely. The occurrence of this stochastic yield catastrophe does not depend on model details but is generic as soon as number fluctuations between constituents are taken into account. On a broader perspective, our results reveal that stochasticity is an important limiting factor for self-assembly and that the specific implementation of the nucleation process plays a significant role in determining the yield.

Submitted 18 March, 2020; v1 submitted 23 May, 2019; originally announced May 2019.

arXiv:1903.05707 [pdf]

Beyond the Gouy-Chapman Model with Heterodyne-Detected Second Harmonic Generation

Authors: Paul E. Ohno, HanByul Chang, Austin P. Spencer, Yangdongling Liu, Mavis D. Boamah, Hong-fei Wang, Franz M. Geiger

Abstract: We report ionic strength-dependent phase shifts in second harmonic generation (SHG) signals from charged interfaces that verify a recent model in which dispersion between the fundamental and second harmonic beams modulates observed signal intensities. We show how phase information can be used to unambiguously separate the chi(2) and interfacial potential-dependent chi(3) terms that contribute to the total signal and provide a path to test primitive ion models and mean field theories for the electrical double layer with experiments to which theory must conform. Finally, we demonstrate the new method on supported lipid bilayers and comment on the ability of our new instrument to identify hyper-Rayleigh scattering contributions to common homodyne SHG measurements in reflection geometries.

Submitted 13 March, 2019; originally announced March 2019.

Comments: 21 manuscript pages, four figures, 10 pages supporting information included, pre-edited version

arXiv:1901.01608 [pdf, other]

doi 10.1088/1742-5468/ab633c

Scaling description of generalization with number of parameters in deep learning

Authors: Mario Geiger, Arthur Jacot, Stefano Spigler, Franck Gabriel, Levent Sagun, Stéphane d'Ascoli, Giulio Biroli, Clément Hongler, Matthieu Wyart

Abstract: Supervised deep learning involves the training of neural networks with a large number $N$ of parameters. For large enough $N$, in the so-called over-parametrized regime, one can essentially fit the training data points. Sparsity-based arguments would suggest that the generalization error increases as $N$ grows past a certain threshold $N^{*}$. Instead, empirical studies have shown that in the over-parametrized regime, generalization error keeps decreasing with $N$. We resolve this paradox through a new framework. We rely on the so-called Neural Tangent Kernel, which connects large neural nets to kernel methods, to show that the initialization causes finite-size random fluctuations $\|f_{N}-\bar{f}_{N}\|\sim N^{-1/4}$ of the neural net output function $f_{N}$ around its expectation $\bar{f}_{N}$. These affect the generalization error $ε_{N}$ for classification: under natural assumptions, it decays to a plateau value $ε_{\infty}$ in a power-law fashion $\sim N^{-1/2}$. This description breaks down at a so-called jamming transition $N=N^{*}$. At this threshold, we argue that $\|f_{N}\|$ diverges. This result leads to a plausible explanation for the cusp in test error known to occur at $N^{*}$. Our results are confirmed by extensive empirical observations on the MNIST and CIFAR image datasets. Our analysis finally suggests that, given a computational envelope, the smallest generalization error is obtained using several networks of intermediate sizes, just beyond $N^{*}$, and averaging their outputs.

Submitted 8 October, 2019; v1 submitted 6 January, 2019; originally announced January 2019.

Comments: The clarity of the text has been improved: the section "Related works" has been updated and the section "3.1 Regression task" has been added

arXiv:1810.09665 [pdf, other]

doi 10.1088/1751-8121/ab4c8b

A jamming transition from under- to over-parametrization affects loss landscape and generalization

Authors: Stefano Spigler, Mario Geiger, Stéphane d'Ascoli, Levent Sagun, Giulio Biroli, Matthieu Wyart

Abstract: We argue that in fully-connected networks a phase transition delimits the over- and under-parametrized regimes where fitting can or cannot be achieved. Under some general conditions, we show that this transition is sharp for the hinge loss. In the whole over-parametrized regime, poor minima of the loss are not encountered during training since the number of constraints to satisfy is too small to hamper minimization. Our findings support a link between this transition and the generalization properties of the network: as we increase the number of parameters of a given model, starting from an under-parametrized network, we observe that the generalization error displays three phases: (i) initial decay, (ii) increase until the transition point --- where it displays a cusp --- and (iii) slow decay toward a constant for the rest of the over-parametrized regime. Thereby we identify the region where the classical phenomenon of over-fitting takes place, and the region where the model keeps improving, in line with previous empirical observations for modern neural networks.

Submitted 18 June, 2019; v1 submitted 22 October, 2018; originally announced October 2018.

Comments: arXiv admin note: text overlap with arXiv:1809.09349

arXiv:1809.09349 [pdf, other]

doi 10.1103/PhysRevE.100.012115

The jamming transition as a paradigm to understand the loss landscape of deep neural networks

Authors: Mario Geiger, Stefano Spigler, Stéphane d'Ascoli, Levent Sagun, Marco Baity-Jesi, Giulio Biroli, Matthieu Wyart

Abstract: Deep learning has been immensely successful at a variety of tasks, ranging from classification to AI. Learning corresponds to fitting training data, which is implemented by descending a very high-dimensional loss function. Understanding under which conditions neural networks do not get stuck in poor minima of the loss, and how the landscape of that loss evolves as depth is increased remains a challenge. Here we predict, and test empirically, an analogy between this landscape and the energy landscape of repulsive ellipses. We argue that in FC networks a phase transition delimits the over- and under-parametrized regimes where fitting can or cannot be achieved. In the vicinity of this transition, properties of the curvature of the minima of the loss are critical. This transition shares direct similarities with the jamming transition by which particles form a disordered solid as the density is increased, which also occurs in certain classes of computational optimization and learning problems such as the perceptron. Our analysis gives a simple explanation as to why poor minima of the loss cannot be encountered in the overparametrized regime, and puts forward the surprising result that the ability of fully connected networks to fit random data is independent of their depth. Our observations suggest that this independence also holds for real data. We also study a quantity $Δ$ which characterizes how well ($Δ<0$) or badly ($Δ>0$) a datum is learned. At the critical point it is power-law distributed, $P_+(Δ)\simΔ^θ$ for $Δ>0$ and $P_-(Δ)\sim(-Δ)^{-γ}$ for $Δ<0$, with $θ\approx0.3$ and $γ\approx0.2$. This observation suggests that near the transition the loss landscape has a hierarchical structure and that the learning dynamics is prone to avalanche-like dynamics, with abrupt changes in the set of patterns that are learned.

Submitted 17 June, 2019; v1 submitted 25 September, 2018; originally announced September 2018.

Journal ref: Phys. Rev. E 100, 012115 (2019)

arXiv:1809.04909 [pdf]

Dendritic Oxide Growth in Zero-Valent Iron Nanofilms Revealed by Atom Probe Tomography

Authors: Mavis D. Boamah, Dieter Isheim, Franz M. Geiger

Abstract: Atom probe tomography (APT) analysis of chemically pure nanofilms of zero-valent iron (Fe(0), or ZVI) and their thermal oxide nano-overlayers reveals the presence of dendritic iron oxide features that extend from the oxide nano-overlayer surface into the ZVI bulk. The dendrites are observed by APT to be in the 5 nm x 10 nm size range and form quickly under natural atmospheric conditions. Their growth into the ZVI layer is, within the limit of our three-month long study, self-limiting (i.e. their initial growth appears to quickly discontinue). The atomistic views presented here shed first light on the atmospheric corrosion process of Fe(0)-bearing engineered nanostructures and their surfaces in the limit of low bulk impurities. Possible roles of the newly identified oxidized iron dendrites are also discussed in the context of passivation processes limiting technological applications of Fe(0).

Submitted 26 November, 2018; v1 submitted 13 September, 2018; originally announced September 2018.

Comments: Pre-edited version, 14 pages main manuscript text, 7 figures, supporting information

arXiv:1806.07339 [pdf, other]

doi 10.1103/PhysRevE.98.062316

Topologically robust zero-sum games and Pfaffian orientation -- How network topology determines the long-time dynamics of the antisymmetric Lotka-Volterra equation

Authors: Philipp M. Geiger, Johannes Knebel, Erwin Frey

Abstract: To explore how the topology of interaction networks determines the robustness of dynamical systems, we study the antisymmetric Lotka-Volterra equation (ALVE). The ALVE is the replicator equation of zero-sum games in evolutionary game theory, in which the strengths of pairwise interactions between strategies are defined by an antisymmetric matrix such that typically some strategies go extinct over time. Here we show that there also exist topologically robust zero-sum games, such as the rock-paper-scissors game, for which all strategies coexist for all choices of interaction strengths. We refer to such zero-sum games as coexistence networks and construct coexistence networks with an arbitrary number of strategies. By mapping the long-time dynamics of the ALVE to the algebra of antisymmetric matrices, we identify simple graph-theoretical rules by which coexistence networks are constructed. Examples are triangulations of cycles characterized by the golden ratio $\varphi = 1.6180...$, cycles with complete subnetworks, and non-Hamiltonian networks. In graph-theoretical terms, we extend the concept of a Pfaffian orientation from even-sized to odd-sized networks. Our results show that the topology of interaction networks alone can determine the long-time behavior of nonlinear dynamical systems, and may help to identify robust network motifs arising, for example, in ecology.

Submitted 19 June, 2018; originally announced June 2018.

Comments: 43 pages, 12 figures

Report number: LMU-ASC 38/18

Journal ref: Phys. Rev. E 98, 062316 (2018)

arXiv:1803.06969 [pdf, other]

doi 10.1088/1742-5468/ab3281

Comparing Dynamics: Deep Neural Networks versus Glassy Systems

Authors: M. Baity-Jesi, L. Sagun, M. Geiger, S. Spigler, G. Ben Arous, C. Cammarota, Y. LeCun, M. Wyart, G. Biroli

Abstract: We analyze numerically the training dynamics of deep neural networks (DNN) by using methods developed in statistical physics of glassy systems. The two main issues we address are (1) the complexity of the loss landscape and of the dynamics within it, and (2) to what extent DNNs share similarities with glassy systems. Our findings, obtained for different architectures and datasets, suggest that during the training process the dynamics slows down because of an increasingly large number of flat directions. At large times, when the loss is approaching zero, the system diffuses at the bottom of the landscape. Despite some similarities with the dynamics of mean-field glassy systems, in particular, the absence of barrier crossing, we find distinctive dynamical behaviors in the two cases, showing that the statistical properties of the corresponding loss and energy landscapes are different. In contrast, when the network is under-parametrized we observe a typical glassy behavior, thus suggesting the existence of different phases depending on whether the network is under-parametrized or over-parametrized.

Submitted 7 June, 2018; v1 submitted 19 March, 2018; originally announced March 2018.

Comments: 10 pages, 5 figures. Version accepted at ICML 2018

Journal ref: PMLR 80:324-333, 2018; Republication with DOI (cite this one): J. Stat. Mech. (2019) 124013

arXiv:1803.01202 [pdf]

Hydrogen Bond Networks Near Supported Lipid Bilayers from Vibrational Sum Frequency Generation Experiments and Atomistic Simulations

Authors: Merve Dogangun, Paul E. Ohno, Dongyue Liang, Alicia C. McGeachy, Ariana Gray Be, Naomi Dalchand, Tianzhe Li, Qiang Cui, Franz M. Geiger

Abstract: We report vibrational sum frequency generation spectra from supported lipid bilayers in which the OH and the CH stretching signals are probed at different salt concentrations. Atomistic simulations show a negligible impact of salt on the OH stretching spectra, indicating the observed SFG intensity changes are due to chi(3) and potential dependent contributions. These are further analyzed in the context of exact-zero reference states. Further experiments and simulations identify specific hydrogen bonding interactions between interfacial water molecules at the PC head group of the zwitterionic DMPC lipids at 3200 wavenumbers.

Submitted 3 March, 2018; originally announced March 2018.

Comments: Pre-edited version, 14 Manuscript pages, 5 Figures, 1 Table, 29 pages Supporting Information available upon request

arXiv:1712.09086 [pdf]

doi 10.1021/acs.jpca.8b02802

On Second-Order Vibrational Lineshapes of the Air/Water Interface

Authors: Paul Ohno, Hong-fei Wang, James Skinner, Francesco Paesani, Franz M. Geiger

Abstract: We explore by means of modeling how absorptive-dispersive mixing between the second- and third-order terms modifies the imaginary chi(2)total responses from air/water interfaces under conditions of varying charge densities and ionic strength. To do so, we use published Im(chi(2)) and chi(3) spectra of the neat air/water interface that were obtained either from computations or experiments. We find that the chi(2)total spectral lineshapes corresponding to experimentally measured spectra contain significant contributions from both interfacial chi(2) and bulk chi(3) terms at interfacial charge densities equivalent to less than 0.005% of a monolayer of water molecules, especially in the 3100 wavenumber to 3300 wavenumber frequency region. Additionally, the role of short-range static dipole potentials is examined under conditions mimicking brine. Our results indicate that surface potentials, if indeed present at the air/water interface, manifest themselves spectroscopically in the tightly bonded H-bond network observable in the 3200 wavenumber frequency range.

Submitted 23 March, 2018; v1 submitted 25 December, 2017; originally announced December 2017.

Comments: Pre-edited version, 21 pages, 3 figures, supporting information available upon request

arXiv:1703.03686 [pdf]

doi 10.1038/s41467-017-01088-0

Second-Order Spectral Lineshapes from Charged Interfaces

Authors: Paul E. Ohno, Hong-fei Wang, Franz M. Geiger

Abstract: Second-order nonlinear spectroscopy has proven to be a powerful tool in elucidating key chemical and structural characteristics at a variety of interfaces. However, the presence of interfacial potentials may lead to complications regarding the interpretation of second harmonic and vibrational sum frequency generation responses from charged interfaces due to mixing of absorptive and dispersive contributions. Here, we examine by means of mathematical modeling how this interaction influences second-order spectral lineshapes. We discuss our findings in the context of reported nonlinear optical spectra obtained from charged water/air and solid/liquid interfaces and demonstrate the importance of accounting for the interfacial potential-dependent χ(3) term in interpreting lineshapes when seeking molecular information from charged interfaces using second-order spectroscopy.

Submitted 7 September, 2017; v1 submitted 10 March, 2017; originally announced March 2017.

Comments: Pre-edited final accepted version, 24 Pages, 6 Figures, Supplementary Data 1-7 available upon request to [email protected]

Journal ref: Nature Communications 8, Article number: 1032 (2017)

arXiv:1702.02496 [pdf]

doi 10.1063/1.5011977

Relative Permittivity in the Electrical Double Layer from Nonlinear Optics

Authors: Mavis D. Boamah, Paul E. Ohno, Franz M. Geiger, Kenneth B. Eisenthal

Abstract: Second harmonic generation (SHG) spectroscopy has been applied to probe the fused silica/water interface at pH 7 and the uncharged $(1\bar{1}02)$ sapphire/water interface at pH 5.2 in contact with aqueous solutions of NaCl, NaBr, NaI, KCl, RbCl, and CsCl as low as several 10 microM. For ionic strengths up to about 0.1 mM, the SHG responses were observed to increase, reversibly for all salts surveyed, when compared to the condition of zero salt added. Further increases in the salt concentration led to monotonic decreases in the SHG response. The SHG increases followed by decreases are found to be consistent with recent reports of phase interference and phase matching in nonlinear optics. By varying the relative permittivity employed in common mean field theories used to describe electrical double layers, and by comparing our results to available literature data, we find that models recapitulating the experimental observations are ones in which 1) the relative permittivity of the diffuse layer is that of bulk water, with other possible values as low as 30, 2) the surface charge density varies with salt concentration, and 3) the charge in the Stern layer or its thickness varies with salt concentration. We also note that the experimental data exhibit sensitivity depending on whether the salt concentration is increased from low to high values or decreased from high to low values, which, however, is not borne out in the fits, at least within the current uncertainties associated with the model point estimates.

Submitted 19 December, 2017; v1 submitted 8 February, 2017; originally announced February 2017.

Comments: Pre-edited final accepted version, 15 Pages main text, 1 Table, 5 Figures, Mathematica notebook available upon request

Journal ref: Journal of Chemical Physics, 148, 222808 (2018)

arXiv:1606.09310 [pdf]

doi 10.1038/ncomms13587

Phase-referenced Nonlinear Spectroscopy of the alpha-Quartz/Water Interface

Authors: Paul E. Ohno, Sarah A. Saslow, Hong-fei Wang, Franz M. Geiger, Kenneth B. Eisenthal

Abstract: Probing the polarization of water molecules at charged interfaces by second harmonic generation spectroscopy has been heretofore limited to isotropic materials. Here, we report non-resonant nonlinear optical measurements at the interface of anisotropic z-cut α-quartz and water under conditions of dynamically changing ionic strength and bulk solution pH. We find that the product of the third-order susceptibility and the interfacial potential, χ(3)·Φ(0), is given by (χ(3) - iχ(3))·Φ(0), and that the interference between this product and the second-order susceptibility of bulk quartz depends on the rotation angle of α-quartz around the z-axis. Our experiments show that this newly identified term, iχ(3)·Φ(0), which is out of phase from the surface terms, is of bulk origin. The possibility of internally phase referencing the interfacial response for the interfacial orientation analysis of species or materials in contact with α-quartz is discussed along with the implications for conditions of resonance enhancement.

Submitted 13 October, 2016; v1 submitted 29 June, 2016; originally announced June 2016.

Comments: Pre-edited, final accepted and updated version, 19 pages, 4 figures, 11 pages supporting information with 8 Supplementary Figures, one Supplementary Table, and one Supplementary Note

Journal ref: Nature Communications 7, 13587 (2016)

arXiv:1411.1034 [pdf]

doi 10.1038/ncomms7539

Aqueous Proton Transfer Across Single Layer Graphene

Authors: Jennifer L. Achtyl, Raymond R. Unocic, Lijun Xu, Yu Cai, Muralikrishna Raju, Weiwei Zhang, Robert L. Sacci, Ivan V. Vlassiouk, Pasquale F. Fulvio, Panchapakesan Ganesh, David J. Wesolowski, Sheng Dai, Adri C. T. van Duin, Matthew Neurock, Franz M. Geiger

Abstract: Proton transfer across single layer graphene is associated with large computed energy barriers and is therefore thought to be unfavorable at room temperature unless nanoscale holes or dopants are introduced, or a potential bias is applied. Here, we subject single layer graphene supported on fused silica to cycles of high and low pH and show that protons transfer reversibly from the aqueous phase through the graphene to the other side where they undergo acid-base chemistry with the silica hydroxyl groups. After ruling out diffusion through macroscopic pinholes, the protons are found to transfer through rare, naturally occurring atomic defects. Computer simulations reveal low energy barriers of 0.68 to 0.75 eV for aqueous proton transfer across hydroxyl-terminated atomic defects that participate in a Grotthuss-type relay, while pyrylium-like ether terminations shut down proton exchange. Unfavorable energy barriers to helium and hydrogen transfer indicate the transfer process is selective for aqueous protons.

Submitted 30 January, 2015; v1 submitted 4 November, 2014; originally announced November 2014.

Comments: 80 pages, including Supporting Information, 3 Figures, 1 Table, final pre-edited version

Journal ref: Nature Communications 6, 6539 (2015)

Showing 1–24 of 24 results for author: Geiger, M