-
Provably efficient variational generative modeling of quantum many-body systems via quantum-probabilistic information geometry
Authors:
Faris M. Sbahi,
Antonio J. Martinez,
Sahil Patel,
Dmitri Saberi,
Jae Hyeon Yoo,
Geoffrey Roeder,
Guillaume Verdon
Abstract:
The dual tasks of quantum Hamiltonian learning and quantum Gibbs sampling are relevant to many important problems in physics and chemistry. In the low temperature regime, algorithms for these tasks often suffer from intractabilities, for example from poor sample- or time-complexity. With the aim of addressing such intractabilities, we introduce a generalization of quantum natural gradient descent to parameterized mixed states, as well as provide a robust first-order approximating algorithm, Quantum-Probabilistic Mirror Descent. We prove data sample efficiency for the dual tasks using tools from information geometry and quantum metrology, thus generalizing the seminal result of classical Fisher efficiency to a variational quantum algorithm for the first time. Our approaches extend previously sample-efficient techniques to allow for flexibility in model choice, including to spectrally-decomposed models like Quantum Hamiltonian-Based Models, which may circumvent intractable time complexities. Our first-order algorithm is derived using a novel quantum generalization of the classical mirror descent duality. Both results require a special choice of metric, namely, the Bogoliubov-Kubo-Mori metric. To test our proposed algorithms numerically, we compare their performance to existing baselines on the task of quantum Gibbs sampling for the transverse field Ising model. Finally, we propose an initialization strategy leveraging geometric locality for the modelling of sequences of states such as those arising from quantum-stochastic processes. We demonstrate its effectiveness empirically for both real and imaginary time evolution while defining a broader class of potential applications.
Submitted 9 June, 2022;
originally announced June 2022.
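The abstract above builds on the classical mirror descent duality. As a point of reference only, the following is a minimal sketch of classical entropic mirror descent on the probability simplex, not the paper's Quantum-Probabilistic Mirror Descent. The objective (a KL divergence to a fixed target), the step size, and all function names are assumptions chosen for illustration.

```python
import math


def mirror_descent_step(p, grad, eta):
    """One entropic mirror descent step on the probability simplex.

    The negative-entropy mirror map sends p to log p (the dual coordinates);
    the gradient step is taken there and mapped back by exponentiating
    and renormalizing.
    """
    logits = [math.log(pi) - eta * gi for pi, gi in zip(p, grad)]
    shift = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - shift) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]


def kl_gradient(p, target):
    """Gradient of KL(p || target) with respect to p (illustrative objective)."""
    return [math.log(pi / ti) + 1.0 for pi, ti in zip(p, target)]


def fit(target, steps=50, eta=0.5):
    """Run mirror descent from the uniform distribution toward `target`."""
    p = [1.0 / len(target)] * len(target)
    for _ in range(steps):
        p = mirror_descent_step(p, kl_gradient(p, target), eta)
    return p


target = [0.7, 0.2, 0.1]  # hypothetical target distribution
p = fit(target)
```

With this objective each step reduces to a geometric interpolation, p_new ∝ p^(1 - eta) * target^eta, so the iterates stay on the simplex without an explicit projection.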
-
More Stiffness with Less Fiber: End-to-End Fiber Path Optimization for 3D-Printed Composites
Authors:
Xingyuan Sun,
Geoffrey Roeder,
Tianju Xue,
Ryan P. Adams,
Szymon Rusinkiewicz
Abstract:
In 3D printing, stiff fibers (e.g., carbon fiber) can reinforce thermoplastic polymers with limited stiffness. However, existing commercial digital manufacturing software only provides a few simple fiber layout algorithms, which solely use the geometry of the shape. In this work, we build an automated fiber path planning algorithm that maximizes the stiffness of a 3D print given specified external loads. We formalize this as an optimization problem: an objective function is designed to measure the stiffness of the object while regularizing certain properties of fiber paths (e.g., smoothness). To initialize each fiber path, we use finite element analysis to calculate the stress field on the object and greedily "walk" in the direction of the stress field. We then apply a gradient-based optimization algorithm that uses the adjoint method to calculate the gradient of stiffness with respect to fiber layout. We compare our approach, in both simulation and real-world experiments, to three baselines: (1) concentric fiber rings generated by Eiger, a leading digital manufacturing software package developed by Markforged, (2) greedy extraction on the simulated stress field (i.e., our method without optimization), and (3) the greedy algorithm on a fiber orientation field calculated by smoothing the simulated stress fields. The results show that objects with fiber paths generated by our algorithm achieve greater stiffness while using less fiber than the baselines--our algorithm improves the Pareto frontier of object stiffness as a function of fiber usage. Ablation studies show that the smoothing regularizer is needed for feasible fiber paths and stability of optimization, and multi-resolution optimization helps reduce the running time compared to single-resolution optimization.
Submitted 29 October, 2023; v1 submitted 31 May, 2022;
originally announced May 2022.
-
Probabilistic Graphical Models and Tensor Networks: A Hybrid Framework
Authors:
Jacob Miller,
Geoffrey Roeder,
Tai-Danae Bradley
Abstract:
We investigate a correspondence between two formalisms for discrete probabilistic modeling: probabilistic graphical models (PGMs) and tensor networks (TNs), a powerful modeling framework for simulating complex quantum systems. The graphical calculus of PGMs and TNs exhibits many similarities, with discrete undirected graphical models (UGMs) being a special case of TNs. However, more general probabilistic TN models such as Born machines (BMs) employ complex-valued hidden states to produce novel forms of correlation among the probabilities. While representing a new modeling resource for capturing structure in discrete probability distributions, this behavior also renders the direct application of standard PGM tools impossible. We aim to bridge this gap by introducing a hybrid PGM-TN formalism that integrates quantum-like correlations into PGM models in a principled manner, using the physically-motivated concept of decoherence. We first prove that applying decoherence to the entirety of a BM model converts it into a discrete UGM, and conversely, that any subgraph of a discrete UGM can be represented as a decohered BM. This method allows a broad family of probabilistic TN models to be encoded as partially decohered BMs, a fact we leverage to combine the representational strengths of both model families. We experimentally verify the performance of such hybrid models in a sequential modeling task, and identify promising uses of our method within the context of existing applications of graphical models.
Submitted 29 June, 2021;
originally announced June 2021.
-
On Linear Identifiability of Learned Representations
Authors:
Geoffrey Roeder,
Luke Metz,
Diederik P. Kingma
Abstract:
Identifiability is a desirable property of a statistical model: it implies that the true model parameters may be estimated to any desired precision, given sufficient computational resources and data. We study identifiability in the context of representation learning: discovering nonlinear data representations that are optimal with respect to some downstream task. When parameterized as deep neural networks, such representation functions typically lack identifiability in parameter space, because they are overparameterized by design. In this paper, building on recent advances in nonlinear ICA, we aim to rehabilitate identifiability by showing that a large family of discriminative models are in fact identifiable in function space, up to a linear indeterminacy. Many models for representation learning across a wide variety of domains, including text, images, and audio, are identifiable in this sense, among them models that were state-of-the-art at the time of publication. We derive sufficient conditions for linear identifiability and provide empirical support for the result on both simulated and real-world data.
Submitted 7 July, 2020; v1 submitted 1 July, 2020;
originally announced July 2020.
-
Learning Composable Energy Surrogates for PDE Order Reduction
Authors:
Alex Beatson,
Jordan T. Ash,
Geoffrey Roeder,
Tianju Xue,
Ryan P. Adams
Abstract:
Meta-materials are an important emerging class of engineered materials in which complex macroscopic behaviour--whether electromagnetic, thermal, or mechanical--arises from modular substructure. Simulation and optimization of these materials are computationally challenging, as rich substructures necessitate high-fidelity finite element meshes to solve the governing PDEs. To address this, we leverage parametric modular structure to learn component-level surrogates, enabling cheaper high-fidelity simulation. We use a neural network to model the stored potential energy in a component given boundary conditions. This yields a structured prediction task: macroscopic behavior is determined by the minimizer of the system's total potential energy, which can be approximated by composing these surrogate models. Composable energy surrogates thus permit simulation in the reduced basis of component boundaries. Costly ground-truth simulation of the full structure is avoided, as training data are generated by performing finite element analysis with individual components. Using dataset aggregation to choose training boundary conditions allows us to learn energy surrogates which produce accurate macroscopic behavior when composed, accelerating simulation of parametric meta-materials.
Submitted 15 May, 2020; v1 submitted 13 May, 2020;
originally announced May 2020.
-
Efficient Amortised Bayesian Inference for Hierarchical and Nonlinear Dynamical Systems
Authors:
Geoffrey Roeder,
Paul K. Grant,
Andrew Phillips,
Neil Dalchau,
Edward Meeds
Abstract:
We introduce a flexible, scalable Bayesian inference framework for nonlinear dynamical systems characterised by distinct and hierarchical variability at the individual, group, and population levels. Our model class is a generalisation of nonlinear mixed-effects (NLME) dynamical systems, the statistical workhorse for many experimental sciences. We cast parameter inference as stochastic optimisation of an end-to-end differentiable, block-conditional variational autoencoder. We specify the dynamics of the data-generating process as an ordinary differential equation (ODE) such that both the ODE and its solver are fully differentiable. This model class is highly flexible: the ODE right-hand sides can be a mixture of user-prescribed or "white-box" sub-components and neural network or "black-box" sub-components. Using stochastic optimisation, our amortised inference algorithm could seamlessly scale up to massive data collection pipelines (common in labs with robotic automation). Finally, our framework supports interpretability with respect to the underlying dynamics, as well as predictive generalization to unseen combinations of group components (also called "zero-shot" learning). We empirically validate our method by predicting the dynamic behaviour of bacteria that were genetically engineered to function as biosensors. Our implementation of the framework, the dataset, and all code to reproduce the experimental results are available at https://www.github.com/Microsoft/vi-hds .
Submitted 1 October, 2019; v1 submitted 28 May, 2019;
originally announced May 2019.
-
Backpropagation through the Void: Optimizing control variates for black-box gradient estimation
Authors:
Will Grathwohl,
Dami Choi,
Yuhuai Wu,
Geoffrey Roeder,
David Duvenaud
Abstract:
Gradient-based optimization is the foundation of deep learning and reinforcement learning. Even when the mechanism being optimized is unknown or not differentiable, optimization using high-variance or biased gradient estimates is still often the best strategy. We introduce a general framework for learning low-variance, unbiased gradient estimators for black-box functions of random variables. Our method uses gradients of a neural network trained jointly with model parameters or policies, and is applicable in both discrete and continuous settings. We demonstrate this framework for training discrete latent-variable models. We also give an unbiased, action-conditional extension of the advantage actor-critic reinforcement learning algorithm.
Submitted 23 February, 2018; v1 submitted 31 October, 2017;
originally announced November 2017.
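To illustrate why a control variate can lower variance without introducing bias, here is a small sketch using the simplest possible case: a constant baseline subtracted inside a score-function estimator for a Bernoulli variable. This is not the paper's method, which learns a neural-network control variate; the test function `f`, its offset `t`, and the value of `theta` are assumptions picked for illustration. The moments are computed exactly by enumeration rather than by sampling.

```python
def f(b, t=0.45):
    """Hypothetical black-box function of a binary variable."""
    return (b - t) ** 2


def score(b, theta):
    """d/dtheta of log Bernoulli(b; theta)."""
    return b / theta - (1 - b) / (1 - theta)


def estimator_moments(theta, baseline):
    """Exact mean and variance of (f(b) - baseline) * score(b, theta)
    for b ~ Bernoulli(theta), obtained by enumerating b in {0, 1}."""
    probs = {1: theta, 0: 1.0 - theta}
    samples = {b: (f(b) - baseline) * score(b, theta) for b in probs}
    mean = sum(probs[b] * samples[b] for b in probs)
    second_moment = sum(probs[b] * samples[b] ** 2 for b in probs)
    return mean, second_moment - mean ** 2


theta = 0.3
# E[f(b)] = theta * f(1) + (1 - theta) * f(0), so its derivative is f(1) - f(0).
true_grad = f(1) - f(0)

# Plain score-function estimator (baseline = 0).
plain_mean, plain_var = estimator_moments(theta, 0.0)

# Variance-minimizing constant baseline for this two-outcome case.
best_baseline = (1 - theta) * f(1) + theta * f(0)
cv_mean, cv_var = estimator_moments(theta, best_baseline)
```

Both estimators have the same mean, `true_grad`, because the score has zero expectation; only the variance changes with the baseline. In this two-outcome toy case the optimal constant removes the variance entirely, which does not happen in general.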
-
Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference
Authors:
Geoffrey Roeder,
Yuhuai Wu,
David Duvenaud
Abstract:
We propose a simple and general variant of the standard reparameterized gradient estimator for the variational evidence lower bound. Specifically, we remove a part of the total derivative with respect to the variational parameters that corresponds to the score function. Removing this term produces an unbiased gradient estimator whose variance approaches zero as the approximate posterior approaches the exact posterior. We analyze the behavior of this gradient estimator theoretically and empirically, and generalize it to more complex variational distributions such as mixtures and importance-weighted posteriors.
Submitted 28 May, 2017; v1 submitted 27 March, 2017;
originally announced March 2017.
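The idea in this abstract, dropping the score-function part of the total derivative, can be sketched on a toy problem where every gradient is available in closed form. The model below (z ~ N(0, 1), x | z ~ N(z, 1), with a Gaussian approximate posterior) and all function names are assumptions made for illustration; they are not taken from the paper. For this model the exact posterior is N(x/2, 1/2), so the claim about vanishing variance can be checked directly.

```python
import math
import random


def dlogp_dz(x, z):
    """d/dz log p(x, z) for the assumed model z ~ N(0,1), x | z ~ N(z,1)."""
    return -z + (x - z)


def total_gradient(x, mu, sigma, eps):
    """Standard reparameterized ELBO gradient w.r.t. (mu, sigma).

    For Gaussian q, log q(mu + sigma*eps) = -log(sigma) - eps**2/2 + const,
    so the total derivative of -log q is 0 for mu and 1/sigma for sigma.
    """
    z = mu + sigma * eps
    return dlogp_dz(x, z), dlogp_dz(x, z) * eps + 1.0 / sigma


def path_gradient(x, mu, sigma, eps):
    """Estimator with the score-function term removed.

    log p(x, z) - log q(z) is differentiated through z only; the
    variational parameters inside log q are treated as constants.
    """
    z = mu + sigma * eps
    dz = dlogp_dz(x, z) + (z - mu) / sigma ** 2  # d/dz [log p - log q]
    return dz, dz * eps  # chain rule: dz/dmu = 1, dz/dsigma = eps


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


rng = random.Random(0)
x = 1.0
mu, sigma = x / 2.0, math.sqrt(0.5)  # exact posterior for this model
eps_samples = [rng.gauss(0.0, 1.0) for _ in range(1000)]

total_mu = [total_gradient(x, mu, sigma, e)[0] for e in eps_samples]
path_mu = [path_gradient(x, mu, sigma, e)[0] for e in eps_samples]
path_sigma = [path_gradient(x, mu, sigma, e)[1] for e in eps_samples]
```

At the exact posterior every entry of `path_mu` and `path_sigma` is zero up to floating-point rounding, while `total_mu` still fluctuates from sample to sample even though its expectation is zero.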
-
Photoabsorption spectra and the X-ray edge problem in graphene
Authors:
Georg Roeder,
Grigory Tkachov,
Martina Hentschel
Abstract:
We study the photoabsorption cross section and Fermi-edge singularities (FES) in graphene. For fillings below one half, we find, besides the expected FES in the form of a peaked edge at the threshold (Fermi) energy, a second singularity arising at excitation energies that correspond to the Dirac point in the density of states. We can explain this behaviour by comparing our results with the photoabsorption cross section of a metal with a small central band gap, where we find a very similar signature. The existence of the second singularity might prove useful for an experimental determination of the Dirac point. We also demonstrate that the photoabsorption signal is enhanced by the zigzag edge states due to their metallic-like character. Since the presence of the edge states indicates a topological defect at the boundary, our study gives an example of a Fermi-edge singularity in a system with a topologically nontrivial electronic spectrum.
Submitted 28 April, 2011;
originally announced April 2011.
-
Many-body effects in the mesoscopic x-ray edge problem
Authors:
Martina Hentschel,
Georg Roeder,
Denis Ullmo
Abstract:
Many-body phenomena, a key interest in the investigation of bulk solid state systems, are studied here in the context of the x-ray edge problem for mesoscopic systems. We investigate the many-body effects associated with the sudden perturbation following the x-ray excitation of a core electron into the conduction band. For small systems with dimensions at the nanoscale, we find considerable deviations from the well-understood metallic case, where the Anderson orthogonality catastrophe and the Mahan-Nozieres-DeDominicis response cause characteristic deviations of the photoabsorption cross section from the naive expectation. Whereas the K-edge is typically rounded in metallic systems, we find a slightly peaked K-edge in generic mesoscopic systems with chaotic-coherent electron dynamics. Thus the behavior of the photoabsorption cross section at threshold depends on the system size and is different for the metallic and the mesoscopic case.
Submitted 30 May, 2007;
originally announced May 2007.