
Fast and Unified Path Gradient Estimators for Normalizing Flows

Authors: Lorenz Vaitl, Ludwig Winkler, Lorenz Richter, Pan Kessel

Abstract: Recent work shows that path gradient estimators for normalizing flows have lower variance compared to standard estimators for variational inference, resulting in improved training. However, they are often prohibitively more expensive from a computational point of view and cannot be applied to maximum likelihood training in a scalable manner, which severely hinders their widespread adoption. In this work, we overcome these crucial limitations. Specifically, we propose a fast path gradient estimator which improves computational efficiency significantly and works for all normalizing flow architectures of practical relevance. We then show that this estimator can also be applied to maximum likelihood training for which it has a regularizing effect as it can take the form of a given target energy function into account. We empirically establish its superior performance and reduced variance for several natural sciences applications.

Submitted 23 March, 2024; originally announced March 2024.

arXiv:2212.08469 [pdf, other]

doi 10.1103/PhysRevD.107.L051504

Learning Trivializing Gradient Flows for Lattice Gauge Theories

Authors: Simone Bacchio, Pan Kessel, Stefan Schaefer, Lorenz Vaitl

Abstract: We propose a unifying approach that starts from the perturbative construction of trivializing maps by Lüscher and then improves on it by learning. The resulting continuous normalizing flow model can be implemented using common tools of lattice field theory and requires several orders of magnitude fewer parameters than any existing machine learning approach. Specifically, our model can achieve competitive performance with as few as 14 parameters while existing deep-learning models have around 1 million parameters for $SU(3)$ Yang--Mills theory on a $16^2$ lattice. This has obvious consequences for training speed and interpretability. It also provides a plausible path for scaling machine-learning approaches toward realistic theories.

Submitted 9 March, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

Comments: 10 pages, 4 figures, 1 table

arXiv:2207.08219 [pdf, other]

Gradients should stay on Path: Better Estimators of the Reverse- and Forward KL Divergence for Normalizing Flows

Authors: Lorenz Vaitl, Kim A. Nicoli, Shinichi Nakajima, Pan Kessel

Abstract: We propose an algorithm to estimate the path-gradient of both the reverse and forward Kullback-Leibler divergence for an arbitrary manifestly invertible normalizing flow. The resulting path-gradient estimators are straightforward to implement, have lower variance, and lead not only to faster convergence of training but also to better overall approximation results compared to standard total gradient estimators. We also demonstrate that path-gradient training is less susceptible to mode-collapse. In light of our results, we expect that path-gradient estimators will become the new standard method to train normalizing flows for variational inference.

Submitted 17 July, 2022; originally announced July 2022.

Comments: 29 pages, 8 figures

arXiv:2206.09016 [pdf, other]

Path-Gradient Estimators for Continuous Normalizing Flows

Authors: Lorenz Vaitl, Kim A. Nicoli, Shinichi Nakajima, Pan Kessel

Abstract: Recent work has established a path-gradient estimator for simple variational Gaussian distributions and has argued that the path-gradient is particularly beneficial in the regime in which the variational distribution approaches the exact target distribution. In many applications, this regime can however not be reached by a simple Gaussian variational distribution. In this work, we overcome this crucial limitation by proposing a path-gradient estimator for the considerably more expressive variational family of continuous normalizing flows. We outline an efficient algorithm to calculate this estimator and establish its superior performance empirically.

Submitted 17 June, 2022; originally announced June 2022.

Comments: 8 pages, 5 figures, 39th International Conference on Machine Learning

Showing 1–4 of 4 results for author: Vaitl, L