Equivariant Neural Tangent Kernels

Authors: Philipp Misof, Pan Kessel, Jan E. Gerken

Abstract: Equivariant neural networks have in recent years become an important technique for guiding architecture selection for neural networks with many applications in domains ranging from medical image analysis to quantum chemistry. In particular, as the most general linear equivariant layers with respect to the regular representation, group convolutions have been highly impactful in numerous applications. Although equivariant architectures have been studied extensively, much less is known about the training dynamics of equivariant neural networks. Concurrently, neural tangent kernels (NTKs) have emerged as a powerful tool to analytically understand the training dynamics of wide neural networks. In this work, we combine these two fields for the first time by giving explicit expressions for NTKs of group convolutional neural networks. In numerical experiments, we demonstrate superior performance for equivariant NTKs over non-equivariant NTKs on a classification task for medical images.

Submitted 10 June, 2024; originally announced June 2024.

Comments: 13 pages + 5 pages appendices

arXiv:2406.06150 [pdf, other]

Physics-Informed Bayesian Optimization of Variational Quantum Circuits

Authors: Kim A. Nicoli, Christopher J. Anders, Lena Funcke, Tobias Hartung, Karl Jansen, Stefan Kühn, Klaus-Robert Müller, Paolo Stornati, Pan Kessel, Shinichi Nakajima

Abstract: In this paper, we propose a novel and powerful method to harness Bayesian optimization for Variational Quantum Eigensolvers (VQEs) -- a hybrid quantum-classical protocol used to approximate the ground state of a quantum Hamiltonian. Specifically, we derive a VQE-kernel which incorporates important prior information about quantum circuits: the kernel feature map of the VQE-kernel exactly matches the known functional form of the VQE's objective function and thereby significantly reduces the posterior uncertainty. Moreover, we propose a novel acquisition function for Bayesian optimization called Expected Maximum Improvement over Confident Regions (EMICoRe) which can actively exploit the inductive bias of the VQE-kernel by treating regions with low predictive uncertainty as indirectly ``observed''. As a result, observations at as few as three points in the search domain are sufficient to determine the complete objective function along an entire one-dimensional subspace of the optimization landscape. Our numerical experiments demonstrate that our approach improves over state-of-the-art baselines.

Submitted 10 June, 2024; originally announced June 2024.

Comments: 36 pages, 17 figures, 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

arXiv:2403.15881 [pdf, other]

Fast and Unified Path Gradient Estimators for Normalizing Flows

Authors: Lorenz Vaitl, Ludwig Winkler, Lorenz Richter, Pan Kessel

Abstract: Recent work shows that path gradient estimators for normalizing flows have lower variance compared to standard estimators for variational inference, resulting in improved training. However, they are often prohibitively more expensive from a computational point of view and cannot be applied to maximum likelihood training in a scalable manner, which severely hinders their widespread adoption. In this work, we overcome these crucial limitations. Specifically, we propose a fast path gradient estimator which improves computational efficiency significantly and works for all normalizing flow architectures of practical relevance. We then show that this estimator can also be applied to maximum likelihood training for which it has a regularizing effect as it can take the form of a given target energy function into account. We empirically establish its superior performance and reduced variance for several natural sciences applications.

Submitted 23 March, 2024; originally announced March 2024.

arXiv:2403.03103 [pdf, other]

Emergent Equivariance in Deep Ensembles

Authors: Jan E. Gerken, Pan Kessel

Abstract: We show that deep ensembles become equivariant for all inputs and at all training times by simply using data augmentation. Crucially, equivariance holds off-manifold and for any architecture in the infinite width limit. The equivariance is emergent in the sense that predictions of individual ensemble members are not equivariant but their collective prediction is. Neural tangent kernel theory is used to derive this result and we verify our theoretical insights using detailed numerical experiments.

Submitted 15 June, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

Comments: 11 pages + 17 pages appendices

arXiv:2307.09379 [pdf, other]

Batched Predictors Generalize within Distribution

Authors: Andreas Loukas, Pan Kessel

Abstract: We study the generalization properties of batched predictors, i.e., models tasked with predicting the mean label of a small set (or batch) of examples. The batched prediction paradigm is particularly relevant for models deployed to determine the quality of a group of compounds in preparation for offline testing. By utilizing a suitable generalization of the Rademacher complexity, we prove that batched predictors come with exponentially stronger generalization guarantees as compared to the standard per-sample approach. Surprisingly, the proposed bound holds independently of overparametrization. Our theoretical insights are validated experimentally for various tasks, architectures, and applications.

Submitted 18 July, 2023; originally announced July 2023.

Comments: 9 pages, 3 figures

arXiv:2302.14082 [pdf, other]

Detecting and Mitigating Mode-Collapse for Flow-based Sampling of Lattice Field Theories

Authors: Kim A. Nicoli, Christopher J. Anders, Tobias Hartung, Karl Jansen, Pan Kessel, Shinichi Nakajima

Abstract: We study the consequences of mode-collapse of normalizing flows in the context of lattice field theory. Normalizing flows allow for independent sampling. For this reason, it is hoped that they can avoid the tunneling problem of local-update MCMC algorithms for multi-modal distributions. In this work, we first point out that the tunneling problem is also present for normalizing flows but is shifted from the sampling to the training phase of the algorithm. Specifically, normalizing flows often suffer from mode-collapse for which the training process assigns vanishingly low probability mass to relevant modes of the physical distribution. This may result in a significant bias when the flow is used as a sampler in a Markov-Chain or with Importance Sampling. We propose a metric to quantify the degree of mode-collapse and derive a bound on the resulting bias. Furthermore, we propose various mitigation strategies in particular in the context of estimating thermodynamic observables, such as the free energy.

Submitted 3 November, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

Comments: 10 pages, 7 figures, 6 pages of supplement material

arXiv:2212.08469 [pdf, other]

doi 10.1103/PhysRevD.107.L051504

Learning Trivializing Gradient Flows for Lattice Gauge Theories

Authors: Simone Bacchio, Pan Kessel, Stefan Schaefer, Lorenz Vaitl

Abstract: We propose a unifying approach that starts from the perturbative construction of trivializing maps by Lüscher and then improves on it by learning. The resulting continuous normalizing flow model can be implemented using common tools of lattice field theory and requires several orders of magnitude fewer parameters than any existing machine learning approach. Specifically, our model can achieve competitive performance with as few as 14 parameters while existing deep-learning models have around 1 million parameters for $SU(3)$ Yang--Mills theory on a $16^2$ lattice. This has obvious consequences for training speed and interpretability. It also provides a plausible path for scaling machine-learning approaches toward realistic theories.

Submitted 9 March, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

Comments: 10 pages, 4 figures, 1 table

arXiv:2207.08219 [pdf, other]

Gradients should stay on Path: Better Estimators of the Reverse- and Forward KL Divergence for Normalizing Flows

Authors: Lorenz Vaitl, Kim A. Nicoli, Shinichi Nakajima, Pan Kessel

Abstract: We propose an algorithm to estimate the path-gradient of both the reverse and forward Kullback-Leibler divergence for an arbitrary manifestly invertible normalizing flow. The resulting path-gradient estimators are straightforward to implement, have lower variance, and lead not only to faster convergence of training but also to better overall approximation results compared to standard total gradient estimators. We also demonstrate that path-gradient training is less susceptible to mode-collapse. In light of our results, we expect that path-gradient estimators will become the new standard method to train normalizing flows for variational inference.

Submitted 17 July, 2022; originally announced July 2022.

Comments: 29 pages, 8 figures

arXiv:2206.09016 [pdf, other]

Path-Gradient Estimators for Continuous Normalizing Flows

Authors: Lorenz Vaitl, Kim A. Nicoli, Shinichi Nakajima, Pan Kessel

Abstract: Recent work has established a path-gradient estimator for simple variational Gaussian distributions and has argued that the path-gradient is particularly beneficial in the regime in which the variational distribution approaches the exact target distribution. In many applications, this regime can however not be reached by a simple Gaussian variational distribution. In this work, we overcome this crucial limitation by proposing a path-gradient estimator for the considerably more expressive variational family of continuous normalizing flows. We outline an efficient algorithm to calculate this estimator and establish its superior performance empirically.

Submitted 17 June, 2022; originally announced June 2022.

Comments: 8 pages, 5 figures, 39th International Conference on Machine Learning

arXiv:2206.05075 [pdf, other]

Diffeomorphic Counterfactuals with Generative Models

Authors: Ann-Kathrin Dombrowski, Jan E. Gerken, Klaus-Robert Müller, Pan Kessel

Abstract: Counterfactuals can explain classification decisions of neural networks in a human interpretable way. We propose a simple but effective method to generate such counterfactuals. More specifically, we perform a suitable diffeomorphic coordinate transformation and then perform gradient ascent in these coordinates to find counterfactuals which are classified with great confidence as a specified target class. We propose two methods to leverage generative models to construct such suitable coordinate systems that are either exactly or approximately diffeomorphic. We analyze the generation process theoretically using Riemannian differential geometry and validate the quality of the generated counterfactuals using various qualitative and quantitative measures.

Submitted 16 June, 2022; v1 submitted 10 June, 2022; originally announced June 2022.

arXiv:2111.11303 [pdf, ps, other]

doi 10.22323/1.396.0338

Machine Learning of Thermodynamic Observables in the Presence of Mode Collapse

Authors: Kim A. Nicoli, Christopher Anders, Lena Funcke, Tobias Hartung, Karl Jansen, Pan Kessel, Shinichi Nakajima, Paolo Stornati

Abstract: Estimating the free energy, as well as other thermodynamic observables, is a key task in lattice field theories. Recently, it has been pointed out that deep generative models can be used in this context [1]. Crucially, these models allow for the direct estimation of the free energy at a given point in parameter space. This is in contrast to existing methods based on Markov chains which generically require integration through parameter space. In this contribution, we will review this novel machine-learning-based estimation method. We will discuss the issue of mode collapse in detail and outline mitigation techniques which are particularly suited for applications at finite temperature.

Submitted 30 November, 2021; v1 submitted 22 November, 2021; originally announced November 2021.

Comments: 10 pages, 2 figures, Proceedings of the 38th International Symposium on Lattice Field Theory, 26th-30th July 2021, Zoom/Gather@Massachusetts Institute of Technology

Report number: MIT-CTP/5353

arXiv:2108.10105 [pdf, other]

doi 10.1103/PhysRevFluids.6.113801

Deep learning for surrogate modelling of 2D mantle convection

Authors: Siddhant Agarwal, Nicola Tosi, Pan Kessel, Doris Breuer, Grégoire Montavon

Abstract: Traditionally, 1D models based on scaling laws have been used to parameterize convective heat transfer in rocks in the interior of terrestrial planets like Earth, Mars, Mercury and Venus to tackle the computational bottleneck of high-fidelity forward runs in 2D or 3D. However, these are limited in the amount of physics they can model (e.g. depth dependent material properties) and predict only mean quantities such as the mean mantle temperature. We recently showed that feedforward neural networks (FNN) trained using a large number of 2D simulations can overcome this limitation and reliably predict the evolution of the entire 1D laterally-averaged temperature profile in time for complex models. We now extend that approach to predict the full 2D temperature field, which contains more information in the form of convection structures such as hot plumes and cold downwellings. Using a dataset of 10,525 two-dimensional simulations of the thermal evolution of the mantle of a Mars-like planet, we show that deep learning techniques can produce reliable parameterized surrogates (i.e. surrogates that predict state variables such as temperature based only on parameters) of the underlying partial differential equations. We first use convolutional autoencoders to compress the temperature fields by a factor of 142 and then use FNN and long short-term memory networks (LSTM) to predict the compressed fields. On average, the FNN predictions are 99.30% and the LSTM predictions are 99.22% accurate with respect to unseen simulations. Proper orthogonal decomposition (POD) of the LSTM and FNN predictions shows that despite a lower mean absolute relative accuracy, LSTMs capture the flow dynamics better than FNNs. When summed, the POD coefficients from FNN predictions and from LSTM predictions amount to 96.51% and 97.66% relative to the coefficients of the original simulations, respectively.

Submitted 5 November, 2021; v1 submitted 23 August, 2021; originally announced August 2021.

Journal ref: Physical Review Fluids, vol. 6, no. 11, 2021

arXiv:2012.10425 [pdf, other]

Towards Robust Explanations for Deep Neural Networks

Authors: Ann-Kathrin Dombrowski, Christopher J. Anders, Klaus-Robert Müller, Pan Kessel

Abstract: Explanation methods shed light on the decision process of black-box classifiers such as deep neural networks. But their usefulness can be compromised because they are susceptible to manipulations. With this work, we aim to enhance the resilience of explanations. We develop a unified theoretical framework for deriving bounds on the maximal manipulability of a model. Based on these theoretical insights, we present three different techniques to boost robustness against manipulation: training with weight decay, smoothing activation functions, and minimizing the Hessian of the network. Our experimental results confirm the effectiveness of these approaches.

Submitted 18 December, 2020; originally announced December 2020.

arXiv:2007.09969 [pdf, other]

Fairwashing Explanations with Off-Manifold Detergent

Authors: Christopher J. Anders, Plamen Pasliev, Ann-Kathrin Dombrowski, Klaus-Robert Müller, Pan Kessel

Abstract: Explanation methods promise to make black-box classifiers more transparent. As a result, it is hoped that they can act as proof for a sensible, fair and trustworthy decision-making process of the algorithm and thereby increase its acceptance by the end-users. In this paper, we show both theoretically and experimentally that these hopes are presently unfounded. Specifically, we show that, for any classifier $g$, one can always construct another classifier $\tilde{g}$ which has the same behavior on the data (same train, validation, and test error) but has arbitrarily manipulated explanation maps. We derive this statement theoretically using differential geometry and demonstrate it experimentally for various explanation methods, architectures, and datasets. Motivated by our theoretical insights, we then propose a modification of existing explanation methods which makes them significantly more robust.

Submitted 20 July, 2020; originally announced July 2020.

Comments: 22 pages with 43 figures, to be published in ICML2020

arXiv:2007.07115 [pdf, other]

doi 10.1103/PhysRevLett.126.032001

Estimation of Thermodynamic Observables in Lattice Field Theories with Deep Generative Models

Authors: Kim A. Nicoli, Christopher J. Anders, Lena Funcke, Tobias Hartung, Karl Jansen, Pan Kessel, Shinichi Nakajima, Paolo Stornati

Abstract: In this work, we demonstrate that applying deep generative machine learning models for lattice field theory is a promising route for solving problems where Markov Chain Monte Carlo (MCMC) methods are problematic. More specifically, we show that generative models can be used to estimate the absolute value of the free energy, which is in contrast to existing MCMC-based methods which are limited to estimating free energy differences. We demonstrate the effectiveness of the proposed method for two-dimensional $φ^4$ theory and compare it to MCMC-based methods in detailed numerical experiments.

Submitted 5 January, 2021; v1 submitted 14 July, 2020; originally announced July 2020.

Comments: 8 figures

Journal ref: Phys. Rev. Lett. 126, 032001 (2021)

arXiv:1910.13496 [pdf, other]

doi 10.1103/PhysRevE.101.023304

Asymptotically unbiased estimation of physical observables with neural samplers

Authors: Kim A. Nicoli, Shinichi Nakajima, Nils Strodthoff, Wojciech Samek, Klaus-Robert Müller, Pan Kessel

Abstract: We propose a general framework for the estimation of observables with generative neural samplers focusing on modern deep generative neural networks that provide an exact sampling probability. In this framework, we present asymptotically unbiased estimators for generic observables, including those that explicitly depend on the partition function such as free energy or entropy, and derive corresponding variance estimators. We demonstrate their practical applicability by numerical experiments for the 2d Ising model which highlight the superiority over existing methods. Our approach greatly enhances the applicability of generative neural samplers to real-world physical systems.

Submitted 13 February, 2020; v1 submitted 29 October, 2019; originally announced October 2019.

Comments: 5 figures

Journal ref: Phys. Rev. E 101, 023304 (2020)

arXiv:1906.07983 [pdf, other]

Explanations can be manipulated and geometry is to blame

Authors: Ann-Kathrin Dombrowski, Maximilian Alber, Christopher J. Anders, Marcel Ackermann, Klaus-Robert Müller, Pan Kessel

Abstract: Explanation methods aim to make neural networks more trustworthy and interpretable. In this paper, we demonstrate a property of explanation methods which is disconcerting for both of these purposes. Namely, we show that explanations can be manipulated arbitrarily by applying visually hardly perceptible perturbations to the input that keep the network's output approximately constant. We establish theoretically that this phenomenon can be related to certain geometrical properties of neural networks. This allows us to derive an upper bound on the susceptibility of explanations to manipulations. Based on this result, we propose effective mechanisms to enhance the robustness of explanations.

Submitted 25 September, 2019; v1 submitted 19 June, 2019; originally announced June 2019.

arXiv:1903.11048 [pdf, other]

Comment on "Solving Statistical Mechanics Using VANs": Introducing saVANt - VANs Enhanced by Importance and MCMC Sampling

Authors: Kim Nicoli, Pan Kessel, Nils Strodthoff, Wojciech Samek, Klaus-Robert Müller, Shinichi Nakajima

Abstract: In this comment on "Solving Statistical Mechanics Using Variational Autoregressive Networks" by Wu et al., we propose a subtle yet powerful modification of their approach. We show that the inherent sampling error of their method can be corrected by using neural network-based MCMC or importance sampling which leads to asymptotically unbiased estimators for physical quantities. This modification is possible due to a singular property of VANs, namely that they provide the exact sample probability. With these modifications, we believe that their method could have a substantially greater impact on various important fields of physics, including strongly-interacting field theories and statistical physics.

Submitted 26 March, 2019; originally announced March 2019.

Comments: 6 pages, 4 figures

arXiv:1810.09751 [pdf, other]

Analysis of Atomistic Representations Using Weighted Skip-Connections

Authors: Kim A. Nicoli, Pan Kessel, Michael Gastegger, Kristof T. Schütt

Abstract: In this work, we extend the SchNet architecture by using weighted skip connections to assemble the final representation. This enables us to study the relative importance of each interaction block for property prediction. We demonstrate on both the QM9 and MD17 datasets that their relative weighting depends strongly on the chemical composition and configurational degrees of freedom of the molecules, which opens the path towards a more detailed understanding of machine learning models for molecules.

Submitted 14 November, 2018; v1 submitted 23 October, 2018; originally announced October 2018.

Comments: NIPS 2018 Workshop: Machine Learning for Molecules and Materials

arXiv:1809.01072 [pdf, other]

doi 10.1021/acs.jctc.8b00908

SchNetPack: A Deep Learning Toolbox For Atomistic Systems

Authors: K. T. Schütt, P. Kessel, M. Gastegger, K. Nicoli, A. Tkatchenko, K. -R. Müller

Abstract: SchNetPack is a toolbox for the development and application of deep neural networks to the prediction of potential energy surfaces and other quantum-chemical properties of molecules and materials. It contains basic building blocks of atomistic neural networks, manages their training and provides simple access to common benchmark datasets. This allows for an easy implementation and evaluation of new models. For now, SchNetPack includes implementations of (weighted) atom-centered symmetry functions and the deep tensor neural network SchNet as well as ready-to-use scripts for training these models on molecule and material datasets. Based upon the PyTorch deep learning framework, SchNetPack makes it possible to efficiently apply the neural networks to large datasets with millions of reference calculations as well as to parallelize the model across multiple GPUs. Finally, SchNetPack provides an interface to the Atomic Simulation Environment in order to make trained models easily accessible to researchers who are not yet familiar with neural networks.

Submitted 4 September, 2018; originally announced September 2018.

arXiv:1805.07279 [pdf, other]

doi 10.1007/JHEP08(2018)076

Simple Unfolded Equations for Massive Higher Spins in AdS$_3$

Authors: Pan Kessel, Joris Raeymaekers

Abstract: We propose a simple unfolded description of free massive higher spin particles in anti-de-Sitter spacetime. While our unfolded equation of motion has the standard form of a covariant constancy condition, our formulation differs from the standard one in that our field takes values in a different internal space, which for us is simply a unitary irreducible representation of the symmetry group. Our main result is the explicit construction, for the case of AdS$_3$, of a map from our formulation to the standard wave equations for massive higher spin particles, as well as to the unfolded description prevalent in the literature. It is hoped that our formulation may be used to clarify the group-theoretic content of interactions in higher spin theories.

Submitted 20 August, 2018; v1 submitted 18 May, 2018; originally announced May 2018.

Comments: 21 pages plus appendices. V2: typos corrected, references added, published version

Journal ref: JHEP 1808 (2018) 076

arXiv:1803.02737 [pdf, ps, other]

doi 10.1103/PhysRevD.97.106021

Cubic interactions of massless bosonic fields in three dimensions II: Parity-odd and Chern-Simons vertices

Authors: Pan Kessel, Karapet Mkrtchyan

Abstract: This work completes the classification of the cubic vertices for arbitrary spin massless bosons in three dimensions started in a previous companion paper by constructing parity-odd vertices. Similarly to the parity-even case, there is a unique parity-odd vertex for any given triple $s_1\geq s_2\geq s_3\geq 2$ of massless bosons if the triangle inequalities are satisfied ($s_1<s_2+s_3$) and none otherwise. These vertices involve two (three) derivatives for odd (even) values of the sum $s_1+s_2+s_3$. A non-trivial relation between parity-even and parity-odd vertices is found. Similarly to the parity-even case, the scalar and Maxwell matter can couple to higher spins through current couplings with higher derivatives. We comment on possible lessons for 2d CFT. We also derive both parity-even and parity-odd vertices with Chern-Simons fields and comment on the analogous classification in two dimensions.

Submitted 7 March, 2018; originally announced March 2018.

Comments: 29 pages

Journal ref: Phys. Rev. D 97, 106021 (2018)

arXiv:1702.03694 [pdf, ps, other]

The Very Basics of Higher-Spin Theory

Authors: Pan Kessel

Abstract: These notes are based on two lectures given at the Twelfth Modave Summer School in Mathematical Physics 2016. The Fronsdal equation and action for both Minkowski and (A)dS backgrounds are discussed in detail.

Submitted 13 February, 2017; originally announced February 2017.

Comments: Contribution to the proceedings of the XII Modave Summer School in Mathematical Physics

arXiv:1508.04139 [pdf, other]

doi 10.1088/1751-8113/49/9/095402

Higher Spin Interactions in Four Dimensions: Vasiliev vs. Fronsdal

Authors: Nicolas Boulanger, Pan Kessel, E. D. Skvortsov, Massimo Taronna

Abstract: We consider four-dimensional Higher-Spin Theory at the first nontrivial order corresponding to the cubic action. All Higher-Spin interaction vertices are explicitly obtained from Vasiliev's equations. In particular, we obtain the vertices that are not determined solely by the Higher-Spin algebra structure constants. The dictionary between the Fronsdal fields and Higher-Spin connections is found and the corrections to the Fronsdal equations are derived. These corrections turn out to involve derivatives of arbitrary order. We observe that the vertices not determined by the Higher-Spin algebra produce naked infinities when decomposed into the minimal derivative vertices and improvements. Therefore, standard methods can only be used to check a rather limited number of correlation functions within the HS AdS/CFT duality. A possible resolution of the puzzle is discussed.

Submitted 12 December, 2015; v1 submitted 17 August, 2015; originally announced August 2015.

Comments: 56 pages=40+Appendices; 1 figure; typos fixed, one ref added

arXiv:1505.05887 [pdf, ps, other]

doi 10.1007/JHEP11(2015)104

Higher Spins and Matter Interacting in Dimension Three

Authors: Pan Kessel, Gustavo Lucena Gomez, E. D. Skvortsov, Massimo Taronna

Abstract: The spectrum of Prokushkin--Vasiliev Theory is puzzling in light of the Gaberdiel--Gopakumar conjecture because it generically contains an additional sector besides higher-spin gauge and scalar fields. We find the unique truncation of the theory avoiding this problem to order 2 in perturbations around AdS$_3$. The second-order backreaction on the physical gauge sector induced by the scalars is computed explicitly. The cubic action for the physical fields is determined completely. We comment on a different higher-spin theory without such additional fields at $λ=1$.

Submitted 17 November, 2015; v1 submitted 21 May, 2015; originally announced May 2015.

Comments: 55 pages + appendices, LaTeX. Final version to appear in JHEP

arXiv:1408.2712 [pdf, ps, other]

doi 10.1088/1751-8113/48/3/035402

Metric- and frame-like higher-spin gauge theories in three dimensions

Authors: Stefan Fredenhagen, Pan Kessel

Abstract: We study the relation between the frame-like and metric-like formulation of higher-spin gauge theories in three space-time dimensions. We concentrate on the theory that is described by an SL(3) x SL(3) Chern-Simons theory in the frame-like formulation. The metric-like theory is obtained by eliminating the generalised spin connection by its equation of motion, and by expressing everything in terms of the metric and a spin-3 Fronsdal field. We give an exact map between fields and gauge parameters in both formulations. To work out the gauge transformations explicitly in terms of metric-like variables, we have to make a perturbative expansion in the spin-3 field. We describe an algorithm for doing this systematically, and we work out the gauge transformations to cubic order in the spin-3 field. We use these results to determine the gauge algebra to this order, and explain why the commutator of two spin-3 transformations only closes on-shell.

Submitted 12 August, 2014; originally announced August 2014.

Comments: 26 pages, no figures

Showing 1–26 of 26 results for author: Kessel, P