A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization

Authors: Sebastian Sanokowski, Sepp Hochreiter, Sebastian Lehner

Abstract: Learning to sample from intractable distributions over discrete sets without relying on corresponding training data is a central problem in a wide range of fields, including Combinatorial Optimization. Currently, popular deep learning-based approaches rely primarily on generative models that yield exact sample likelihoods. This work introduces a method that lifts this restriction and opens the possibility to employ highly expressive latent variable models like diffusion models. Our approach is conceptually based on a loss that upper bounds the reverse Kullback-Leibler divergence and evades the requirement of exact sample likelihoods. We experimentally validate our approach in data-free Combinatorial Optimization and demonstrate that our method achieves a new state-of-the-art on a wide range of benchmark problems.

Submitted 3 June, 2024; originally announced June 2024.

Comments: Accepted at ICML 2024

arXiv:2311.14156 [pdf, other]

Variational Annealing on Graphs for Combinatorial Optimization

Authors: Sebastian Sanokowski, Wilhelm Berghammer, Sepp Hochreiter, Sebastian Lehner

Abstract: Several recent unsupervised learning methods use probabilistic approaches to solve combinatorial optimization (CO) problems based on the assumption of statistically independent solution variables. We demonstrate that this assumption imposes performance limitations in particular on difficult problem instances. Our results corroborate that an autoregressive approach which captures statistical dependencies among solution variables yields superior performance on many popular CO problems. We introduce subgraph tokenization in which the configuration of a set of solution variables is represented by a single token. This tokenization technique alleviates the drawback of the long sequential sampling procedure which is inherent to autoregressive methods without sacrificing expressivity. Importantly, we theoretically motivate an annealed entropy regularization and show empirically that it is essential for efficient and stable learning.

Submitted 23 November, 2023; originally announced November 2023.

Comments: Accepted at NeurIPS 2023

arXiv:2301.13169 [pdf, other]

Improved machine learning algorithm for predicting ground state properties

Authors: Laura Lewis, Hsin-Yuan Huang, Viet T. Tran, Sebastian Lehner, Richard Kueng, John Preskill

Abstract: Finding the ground state of a quantum many-body system is a fundamental problem in quantum physics. In this work, we give a classical machine learning (ML) algorithm for predicting ground state properties with an inductive bias encoding geometric locality. The proposed ML model can efficiently predict ground state properties of an $n$-qubit gapped local Hamiltonian after learning from only $\mathcal{O}(\log(n))$ data about other Hamiltonians in the same quantum phase of matter. This improves substantially upon previous results that require $\mathcal{O}(n^c)$ data for a large constant $c$. Furthermore, the training and prediction time of the proposed ML model scale as $\mathcal{O}(n \log n)$ in the number of qubits $n$. Numerical experiments on physical systems with up to 45 qubits confirm the favorable scaling in predicting ground state properties using a small training dataset.

Submitted 30 January, 2023; originally announced January 2023.

Comments: 8 pages, 5 figures + 32-page appendix

arXiv:2210.04248 [pdf, other]

doi 10.1093/mnras/stac2933

Residual Neural Networks for the Prediction of Planetary Collision Outcomes

Authors: Philip M. Winter, Christoph Burger, Sebastian Lehner, Johannes Kofler, Thomas I. Maindl, Christoph M. Schäfer

Abstract: Fast and accurate treatment of collisions in the context of modern N-body planet formation simulations remains a challenging task due to inherently complex collision processes. We aim to tackle this problem with machine learning (ML), in particular via residual neural networks. Our model is motivated by the underlying physical processes of the data-generating process and allows for flexible prediction of post-collision states. We demonstrate that our model outperforms commonly used collision handling methods such as perfect inelastic merging and feed-forward neural networks in both prediction accuracy and out-of-distribution generalization. Our model outperforms the current state of the art in 20/24 experiments. We provide a dataset that consists of 10164 Smooth Particle Hydrodynamics (SPH) simulations of pairwise planetary collisions. The dataset is specifically suited for ML research to improve computational aspects for collision treatment and for studying planetary collisions in general. We formulate the ML task as a multi-task regression problem, allowing simple, yet efficient training of ML models for collision treatment in an end-to-end manner. Our models can be easily integrated into existing N-body frameworks and can be used within our chosen parameter space of initial conditions, i.e. where similar-sized collisions during late-stage terrestrial planet formation typically occur.

Submitted 9 October, 2022; originally announced October 2022.

Comments: 13 pages, 7 figures, 7 tables

MSC Class: 70F16 ACM Class: E.1; I.6.6; I.2.0

arXiv:2206.03483 [pdf, other]

Few-Shot Learning by Dimensionality Reduction in Gradient Space

Authors: Martin Gauch, Maximilian Beck, Thomas Adler, Dmytro Kotsur, Stefan Fiel, Hamid Eghbal-zadeh, Johannes Brandstetter, Johannes Kofler, Markus Holzleitner, Werner Zellinger, Daniel Klotz, Sepp Hochreiter, Sebastian Lehner

Abstract: We introduce SubGD, a novel few-shot learning method which is based on the recent finding that stochastic gradient descent updates tend to live in a low-dimensional parameter subspace. In experimental and theoretical analyses, we show that models confined to a suitable predefined subspace generalize well for few-shot learning. A suitable subspace fulfills three criteria across the given tasks: it (a) allows to reduce the training error by gradient flow, (b) leads to models that generalize well, and (c) can be identified by stochastic gradient descent. SubGD identifies these subspaces from an eigendecomposition of the auto-correlation matrix of update directions across different tasks. Demonstrably, we can identify low-dimensional suitable subspaces for few-shot learning of dynamical systems, which have varying properties described by one or few parameters of the analytical system description. Such systems are ubiquitous among real-world applications in science and engineering. We experimentally corroborate the advantages of SubGD on three distinct dynamical systems problem settings, significantly outperforming popular few-shot learning methods both in terms of sample efficiency and performance.

Submitted 7 June, 2022; originally announced June 2022.

Comments: Accepted at Conference on Lifelong Learning Agents (CoLLAs) 2022. Code: https://github.com/ml-jku/subgd Blog post: https://ml-jku.github.io/subgd

Journal ref: Proceedings of The 1st Conference on Lifelong Learning Agents, PMLR 199:1043-1064 (2022)

arXiv:2205.12258 [pdf, other]

History Compression via Language Models in Reinforcement Learning

Authors: Fabian Paischer, Thomas Adler, Vihang Patil, Angela Bitto-Nemling, Markus Holzleitner, Sebastian Lehner, Hamid Eghbal-zadeh, Sepp Hochreiter

Abstract: In a partially observable Markov decision process (POMDP), an agent typically uses a representation of the past to approximate the underlying MDP. We propose to utilize a frozen Pretrained Language Transformer (PLT) for history representation and compression to improve sample efficiency. To avoid training of the Transformer, we introduce FrozenHopfield, which automatically associates observations with pretrained token embeddings. To form these associations, a modern Hopfield network stores these token embeddings, which are retrieved by queries that are obtained by a random but fixed projection of observations. Our new method, HELM, enables actor-critic network architectures that contain a pretrained language Transformer for history representation as a memory module. Since a representation of the past need not be learned, HELM is much more sample efficient than competitors. On Minigrid and Procgen environments HELM achieves new state-of-the-art results. Our code is available at https://github.com/ml-jku/helm.

Submitted 21 February, 2023; v1 submitted 24 May, 2022; originally announced May 2022.

Comments: ICML 2022

arXiv:2106.11299 [pdf, other]

Boundary Graph Neural Networks for 3D Simulations

Authors: Andreas Mayr, Sebastian Lehner, Arno Mayrhofer, Christoph Kloss, Sepp Hochreiter, Johannes Brandstetter

Abstract: The abundance of data has given machine learning considerable momentum in natural sciences and engineering, though modeling of physical processes is often difficult. A particularly tough problem is the efficient representation of geometric boundaries. Triangularized geometric boundaries are well understood and ubiquitous in engineering applications. However, it is notoriously difficult to integrate them into machine learning approaches due to their heterogeneity with respect to size and orientation. In this work, we introduce an effective theory to model particle-boundary interactions, which leads to our new Boundary Graph Neural Networks (BGNNs) that dynamically modify graph structures to obey boundary conditions. The new BGNNs are tested on complex 3D granular flow processes of hoppers, rotating drums and mixers, which are all standard components of modern industrial machinery but still have complicated geometry. BGNNs are evaluated in terms of computational efficiency as well as prediction accuracy of particle flows and mixing entropies. BGNNs are able to accurately reproduce 3D granular flows within simulation uncertainties over hundreds of thousands of simulation timesteps. Most notably, in our experiments, particles stay within the geometric objects without using handcrafted conditions or restrictions.

Submitted 20 April, 2023; v1 submitted 21 June, 2021; originally announced June 2021.

Comments: accepted for presentation at the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI-23)

arXiv:2105.01636 [pdf, other]

Learning 3D Granular Flow Simulations

Authors: Andreas Mayr, Sebastian Lehner, Arno Mayrhofer, Christoph Kloss, Sepp Hochreiter, Johannes Brandstetter

Abstract: Recently, the application of machine learning models has gained momentum in natural sciences and engineering, which is a natural fit due to the abundance of data in these fields. However, the modeling of physical processes from simulation data without first principle solutions remains difficult. Here, we present a Graph Neural Networks approach towards accurate modeling of complex 3D granular flow simulation processes created by the discrete element method LIGGGHTS and concentrate on simulations of physical systems found in real world applications like rotating drums and hoppers. We discuss how to implement Graph Neural Networks that deal with 3D objects, boundary conditions, particle - particle, and particle - boundary interactions such that an accurate modeling of relevant physical quantities is made possible. Finally, we compare the machine learning based trajectories to LIGGGHTS trajectories in terms of particle flows and mixing entropies.

Submitted 4 May, 2021; originally announced May 2021.

arXiv:1909.02508 [pdf, other]

Dielectron production at low transverse momentum in Pb-Pb collisions at $ \sqrt{s_{\mathrm{NN}}}=5.02 $ TeV with ALICE

Authors: Sebastian Lehner

Abstract: Dielectrons probe a wide range of phenomena in heavy-ion collisions. These include light- and heavy-flavour meson production, thermal radiation and coherent photo-production. The latter process is distinguished by dielectron production at low transverse-pair momentum ($p_{\mathrm{T,ee}}$). Transverse momentum spectra of dielectrons in central and peripheral Pb-Pb collisions are extracted and compared to the corresponding expectations. In central collisions the data fit the expected spectrum, which is dominated by semi-leptonic decays of correlated heavy-flavour hadrons. In peripheral collisions, the data exhibit an excess at low $p_{\mathrm{T,ee}}$ with respect to hadronic and thermal sources. The observed excess yield is compatible with calculations for dielectron production from coherent photon-photon interactions.

Submitted 12 November, 2019; v1 submitted 5 September, 2019; originally announced September 2019.

arXiv:1606.02586 [pdf, other]

doi 10.1103/PhysRevAccelBeams.19.100702

Commissioning experience and beam physics measurements at the SwissFEL Injector Test Facility

Authors: T. Schietinger, M. Pedrozzi, M. Aiba, V. Arsov, S. Bettoni, B. Beutner, M. Calvi, P. Craievich, M. Dehler, F. Frei, R. Ganter, C. P. Hauri, R. Ischebeck, Y. Ivanisenko, M. Janousch, M. Kaiser, B. Keil, F. Löhl, G. L. Orlandi, C. Ozkan Loch, P. Peier, E. Prat, J.-Y. Raguin, S. Reiche, T. Schilcher, et al. (70 additional authors not shown)

Abstract: The SwissFEL Injector Test Facility operated at the Paul Scherrer Institute between 2010 and 2014, serving as a pilot plant and testbed for the development and realization of SwissFEL, the X-ray Free-Electron Laser facility under construction at the same institute. The test facility consisted of a laser-driven rf electron gun followed by an S-band booster linac, a magnetic bunch compression chicane and a diagnostic section including a transverse deflecting rf cavity. It delivered electron bunches of up to 200 pC charge and up to 250 MeV beam energy at a repetition rate of 10 Hz. The measurements performed at the test facility not only demonstrated the beam parameters required to drive the first stage of an FEL facility, but also led to significant advances in instrumentation technologies, beam characterization methods and the generation, transport and compression of ultra-low-emittance beams. We give a comprehensive overview of the commissioning experience of the principal subsystems and the beam physics measurements performed during the operation of the test facility, including the results of the test of an in-vacuum undulator prototype generating radiation in the vacuum ultraviolet and optical range.

Submitted 27 October, 2016; v1 submitted 8 June, 2016; originally announced June 2016.

Comments: 60 pages, 70 figures; final version as published, except with table of contents and Fig. 68 in full resolution

Journal ref: Phys. Rev. ST Accel. Beams 19, 100702 (2016)

arXiv:1606.01791 [pdf, other]

doi 10.1007/s10751-016-1309-2

Towards Measuring the Ground State Hyperfine Splitting of Antihydrogen -- A Progress Report

Authors: C. Sauerzopf, A. Capon, M. Diermaier, P. Dupré, Y. Higashi, C. Kaga, B. Kolbinger, M. Leali, S. Lehner, E. Lodi Rizzini, C. Malbrunot, V. Mascagna, O. Massiczek, D. J. Murtagh, Y. Nagata, B. Radics, M. C. Simon, K. Suzuki, M. Tajima, S. Ulmer, S. Vamosi, S. van Gorp, J. Zmeskal, H. Breuker, H. Higaki, et al. (6 additional authors not shown)

Abstract: We report the successful commissioning and testing of a dedicated field-ioniser chamber for measuring principal quantum number distributions in antihydrogen as part of the ASACUSA hyperfine spectroscopy apparatus. The new chamber is combined with a beam normalisation detector that consists of plastic scintillators and a retractable passivated implanted planar silicon (PIPS) detector.

Submitted 6 June, 2016; originally announced June 2016.

Comments: 5 pages, 2 figures, 6th International Symposium on Symmetries in Subatomic Physics (SSP2015)

arXiv:1603.02875 [pdf, ps, other]

Asymptotic behaviour of the Riemann mapping function at analytic cusps

Authors: Tobias Kaiser, Sabrina Lehner

Abstract: We completely describe the asymptotic behaviour of the Riemann mapping function and its derivatives at an analytic cusp. We achieve the same for the inverse of the mapping function.

Submitted 12 April, 2016; v1 submitted 9 March, 2016; originally announced March 2016.

Comments: Final version; to appear at Annales Academiae Scientiarum Fennicae Mathematica

MSC Class: 30C20; 30E15

arXiv:1501.01420 [pdf, other]

doi 10.1007/s10751-015-1130-3

Numerical Simulations of Hyperfine Transitions of Antihydrogen

Authors: B. Kolbinger, A. Capon, M. Diermaier, S. Lehner, C. Malbrunot, O. Massiczek, C. Sauerzopf, M. C. Simon, E. Widmann

Abstract: One of the ASACUSA (Atomic Spectroscopy And Collisions Using Slow Antiprotons) collaboration's goals is the measurement of the ground state hyperfine transition frequency in antihydrogen, the antimatter counterpart of one of the best known systems in physics. This high precision experiment yields a sensitive test of the fundamental symmetry of CPT. Numerical simulations of hyperfine transitions of antihydrogen atoms have been performed providing information on the required antihydrogen events and the achievable precision.

Submitted 23 January, 2015; v1 submitted 7 January, 2015; originally announced January 2015.

arXiv:1402.1659 [pdf, ps, other]

doi 10.1007/s10751-014-1013-z

Spectroscopy Apparatus for the Measurement of The Hyperfine Structure of Antihydrogen

Authors: C. Malbrunot, P. Caradonna, M. Diermaier, N. Dilaver, S. Friedreich, B. Kolbinger, S. Lehner, R. Lundmark, O. Massiczek, B. Radics, C. Sauerzopf, M. Simon, E. Widmann, M. Wolf, B. Wunschek, J. Zmeskal

Abstract: The ASACUSA CUSP collaboration at the Antiproton Decelerator (AD) of CERN is planning to measure the ground-state hyperfine splitting of antihydrogen using an atomic spectroscopy beamline. We describe here the latest developments on the spectroscopy apparatus developed to be coupled to the antihydrogen production setup (CUSP).

Submitted 7 February, 2014; originally announced February 2014.

Comments: Proceedings of the 11th International Conference on Low Energy Antiproton Physics (LEAP 2013) held in Uppsala, Sweden, 10 to 15 June, 2013

Showing 1–14 of 14 results for author: Lehner, S