arXiv:2206.04663 [pdf, other]

Provably efficient variational generative modeling of quantum many-body systems via quantum-probabilistic information geometry

Authors: Faris M. Sbahi, Antonio J. Martinez, Sahil Patel, Dmitri Saberi, Jae Hyeon Yoo, Geoffrey Roeder, Guillaume Verdon

Abstract: The dual tasks of quantum Hamiltonian learning and quantum Gibbs sampling are relevant to many important problems in physics and chemistry. In the low temperature regime, algorithms for these tasks often suffer from intractabilities, for example from poor sample- or time-complexity. With the aim of addressing such intractabilities, we introduce a generalization of quantum natural gradient descent to parameterized mixed states, as well as provide a robust first-order approximating algorithm, Quantum-Probabilistic Mirror Descent. We prove data sample efficiency for the dual tasks using tools from information geometry and quantum metrology, thus generalizing the seminal result of classical Fisher efficiency to a variational quantum algorithm for the first time. Our approaches extend previously sample-efficient techniques to allow for flexibility in model choice, including to spectrally-decomposed models like Quantum Hamiltonian-Based Models, which may circumvent intractable time complexities. Our first-order algorithm is derived using a novel quantum generalization of the classical mirror descent duality. Both results require a special choice of metric, namely, the Bogoliubov-Kubo-Mori metric. To test our proposed algorithms numerically, we compare their performance to existing baselines on the task of quantum Gibbs sampling for the transverse field Ising model. Finally, we propose an initialization strategy leveraging geometric locality for the modelling of sequences of states such as those arising from quantum-stochastic processes. We demonstrate its effectiveness empirically for both real and imaginary time evolution while defining a broader class of potential applications.

Submitted 9 June, 2022; originally announced June 2022.

Comments: 24 + 49 pages, 5 + 4 figures

arXiv:2205.16008 [pdf, other]

doi 10.1145/3623263.3623356

More Stiffness with Less Fiber: End-to-End Fiber Path Optimization for 3D-Printed Composites

Authors: Xingyuan Sun, Geoffrey Roeder, Tianju Xue, Ryan P. Adams, Szymon Rusinkiewicz

Abstract: In 3D printing, stiff fibers (e.g., carbon fiber) can reinforce thermoplastic polymers with limited stiffness. However, existing commercial digital manufacturing software only provides a few simple fiber layout algorithms, which solely use the geometry of the shape. In this work, we build an automated fiber path planning algorithm that maximizes the stiffness of a 3D print given specified external loads. We formalize this as an optimization problem: an objective function is designed to measure the stiffness of the object while regularizing certain properties of fiber paths (e.g., smoothness). To initialize each fiber path, we use finite element analysis to calculate the stress field on the object and greedily "walk" in the direction of the stress field. We then apply a gradient-based optimization algorithm that uses the adjoint method to calculate the gradient of stiffness with respect to fiber layout. We compare our approach, in both simulation and real-world experiments, to three baselines: (1) concentric fiber rings generated by Eiger, a leading digital manufacturing software package developed by Markforged, (2) greedy extraction on the simulated stress field (i.e., our method without optimization), and (3) the greedy algorithm on a fiber orientation field calculated by smoothing the simulated stress fields. The results show that objects with fiber paths generated by our algorithm achieve greater stiffness while using less fiber than the baselines--our algorithm improves the Pareto frontier of object stiffness as a function of fiber usage. Ablation studies show that the smoothing regularizer is needed for feasible fiber paths and stability of optimization, and multi-resolution optimization helps reduce the running time compared to single-resolution optimization.

Submitted 29 October, 2023; v1 submitted 31 May, 2022; originally announced May 2022.

Comments: ACM SCF 2023: Proceedings of the 8th Annual ACM Symposium on Computational Fabrication

arXiv:2106.15666 [pdf, other]

Probabilistic Graphical Models and Tensor Networks: A Hybrid Framework

Authors: Jacob Miller, Geoffrey Roeder, Tai-Danae Bradley

Abstract: We investigate a correspondence between two formalisms for discrete probabilistic modeling: probabilistic graphical models (PGMs) and tensor networks (TNs), a powerful modeling framework for simulating complex quantum systems. The graphical calculus of PGMs and TNs exhibits many similarities, with discrete undirected graphical models (UGMs) being a special case of TNs. However, more general probabilistic TN models such as Born machines (BMs) employ complex-valued hidden states to produce novel forms of correlation among the probabilities. While representing a new modeling resource for capturing structure in discrete probability distributions, this behavior also renders the direct application of standard PGM tools impossible. We aim to bridge this gap by introducing a hybrid PGM-TN formalism that integrates quantum-like correlations into PGM models in a principled manner, using the physically-motivated concept of decoherence. We first prove that applying decoherence to the entirety of a BM model converts it into a discrete UGM, and conversely, that any subgraph of a discrete UGM can be represented as a decohered BM. This method allows a broad family of probabilistic TN models to be encoded as partially decohered BMs, a fact we leverage to combine the representational strengths of both model families. We experimentally verify the performance of such hybrid models in a sequential modeling task, and identify promising uses of our method within the context of existing applications of graphical models.

Submitted 29 June, 2021; originally announced June 2021.

Comments: 18 pages, 11 figures

arXiv:2007.00810 [pdf, other]

On Linear Identifiability of Learned Representations

Authors: Geoffrey Roeder, Luke Metz, Diederik P. Kingma

Abstract: Identifiability is a desirable property of a statistical model: it implies that the true model parameters may be estimated to any desired precision, given sufficient computational resources and data. We study identifiability in the context of representation learning: discovering nonlinear data representations that are optimal with respect to some downstream task. When parameterized as deep neural networks, such representation functions typically lack identifiability in parameter space, because they are overparameterized by design. In this paper, building on recent advances in nonlinear ICA, we aim to rehabilitate identifiability by showing that a large family of discriminative models are in fact identifiable in function space, up to a linear indeterminacy. Many models for representation learning in a wide variety of domains have been identifiable in this sense, including text, images and audio, state-of-the-art at time of publication. We derive sufficient conditions for linear identifiability and provide empirical support for the result on both simulated and real-world data.

Submitted 7 July, 2020; v1 submitted 1 July, 2020; originally announced July 2020.

arXiv:2005.06549 [pdf, other]

Learning Composable Energy Surrogates for PDE Order Reduction

Authors: Alex Beatson, Jordan T. Ash, Geoffrey Roeder, Tianju Xue, Ryan P. Adams

Abstract: Meta-materials are an important emerging class of engineered materials in which complex macroscopic behaviour--whether electromagnetic, thermal, or mechanical--arises from modular substructure. Simulation and optimization of these materials are computationally challenging, as rich substructures necessitate high-fidelity finite element meshes to solve the governing PDEs. To address this, we leverage parametric modular structure to learn component-level surrogates, enabling cheaper high-fidelity simulation. We use a neural network to model the stored potential energy in a component given boundary conditions. This yields a structured prediction task: macroscopic behavior is determined by the minimizer of the system's total potential energy, which can be approximated by composing these surrogate models. Composable energy surrogates thus permit simulation in the reduced basis of component boundaries. Costly ground-truth simulation of the full structure is avoided, as training data are generated by performing finite element analysis with individual components. Using dataset aggregation to choose training boundary conditions allows us to learn energy surrogates which produce accurate macroscopic behavior when composed, accelerating simulation of parametric meta-materials.

Submitted 15 May, 2020; v1 submitted 13 May, 2020; originally announced May 2020.

arXiv:1905.12090 [pdf, other]

Efficient Amortised Bayesian Inference for Hierarchical and Nonlinear Dynamical Systems

Authors: Geoffrey Roeder, Paul K. Grant, Andrew Phillips, Neil Dalchau, Edward Meeds

Abstract: We introduce a flexible, scalable Bayesian inference framework for nonlinear dynamical systems characterised by distinct and hierarchical variability at the individual, group, and population levels. Our model class is a generalisation of nonlinear mixed-effects (NLME) dynamical systems, the statistical workhorse for many experimental sciences. We cast parameter inference as stochastic optimisation of an end-to-end differentiable, block-conditional variational autoencoder. We specify the dynamics of the data-generating process as an ordinary differential equation (ODE) such that both the ODE and its solver are fully differentiable. This model class is highly flexible: the ODE right-hand sides can be a mixture of user-prescribed or "white-box" sub-components and neural network or "black-box" sub-components. Using stochastic optimisation, our amortised inference algorithm could seamlessly scale up to massive data collection pipelines (common in labs with robotic automation). Finally, our framework supports interpretability with respect to the underlying dynamics, as well as predictive generalization to unseen combinations of group components (also called "zero-shot" learning). We empirically validate our method by predicting the dynamic behaviour of bacteria that were genetically engineered to function as biosensors. Our implementation of the framework, the dataset, and all code to reproduce the experimental results is available at https://www.github.com/Microsoft/vi-hds .

Submitted 1 October, 2019; v1 submitted 28 May, 2019; originally announced May 2019.

Comments: Published in "Proceedings of Machine Learning Research, Volume 97: International Conference on Machine Learning, 9-15 June 2019, Long Beach, California, USA"

arXiv:1711.00123 [pdf, other]

Backpropagation through the Void: Optimizing control variates for black-box gradient estimation

Authors: Will Grathwohl, Dami Choi, Yuhuai Wu, Geoffrey Roeder, David Duvenaud

Abstract: Gradient-based optimization is the foundation of deep learning and reinforcement learning. Even when the mechanism being optimized is unknown or not differentiable, optimization using high-variance or biased gradient estimates is still often the best strategy. We introduce a general framework for learning low-variance, unbiased gradient estimators for black-box functions of random variables. Our method uses gradients of a neural network trained jointly with model parameters or policies, and is applicable in both discrete and continuous settings. We demonstrate this framework for training discrete latent-variable models. We also give an unbiased, action-conditional extension of the advantage actor-critic reinforcement learning algorithm.

Submitted 23 February, 2018; v1 submitted 31 October, 2017; originally announced November 2017.

Comments: Published at ICLR 2018

arXiv:1703.09194 [pdf, other]

Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference

Authors: Geoffrey Roeder, Yuhuai Wu, David Duvenaud

Abstract: We propose a simple and general variant of the standard reparameterized gradient estimator for the variational evidence lower bound. Specifically, we remove a part of the total derivative with respect to the variational parameters that corresponds to the score function. Removing this term produces an unbiased gradient estimator whose variance approaches zero as the approximate posterior approaches the exact posterior. We analyze the behavior of this gradient estimator theoretically and empirically, and generalize it to more complex variational distributions such as mixtures and importance-weighted posteriors.

Submitted 28 May, 2017; v1 submitted 27 March, 2017; originally announced March 2017.

arXiv:1104.5350 [pdf, ps, other]

doi 10.1209/0295-5075/94/67002

Photoabsorption spectra and the X-ray edge problem in graphene

Authors: Georg Roeder, Grigory Tkachov, Martina Hentschel

Abstract: We study the photoabsorption cross section and Fermi-edge singularities (FES) in graphene. For fillings below one half, we find, besides the expected FES in form of a peaked edge at the threshold (Fermi) energy, a second singularity to arise at excitation energies that correspond to the Dirac point in the density of states. We can explain this behaviour by comparing our results with the photoabsorption cross section of a metal with a small central band gap where we find a very similar signature. The existence of the second singularity might prove useful for an experimental determination of the Dirac point. We also demonstrate that the photoabsorption signal is enhanced by the zigzag edge states due to their metallic-like character. Since the presence of the edge states indicates a topological defect at the boundary, our study gives an example for a Fermi-edge singularity in a system with a topologically nontrivial electronic spectrum.

Submitted 28 April, 2011; originally announced April 2011.

Comments: accepted for publication in Europhysics Letters (2011)

Journal ref: EPL 94, 67002 (2011)

arXiv:0705.4447 [pdf, ps, other]

doi 10.1143/PTPS.166.143

Many-body effects in the mesoscopic x-ray edge problem

Authors: Martina Hentschel, Georg Roeder, Denis Ullmo

Abstract: Many-body phenomena, a key interest in the investigation of bulk solid state systems, are studied here in the context of the x-ray edge problem for mesoscopic systems. We investigate the many-body effects associated with the sudden perturbation following the x-ray excitation of a core electron into the conduction band. For small systems with dimensions at the nanoscale we find considerable deviations from the well-understood metallic case where Anderson orthogonality catastrophe and the Mahan-Nozieres-DeDominicis response cause characteristic deviations of the photoabsorption cross section from the naive expectation. Whereas the K-edge is typically rounded in metallic systems, we find a slightly peaked K-edge in generic mesoscopic systems with chaotic-coherent electron dynamics. Thus the behavior of the photoabsorption cross section at threshold depends on the system size and is different for the metallic and the mesoscopic case.

Submitted 30 May, 2007; originally announced May 2007.

Comments: 9 pages, 3 figures, Proceedings "Quantum Mechanics and Chaos" (Osaka 2006)

Journal ref: Prog. Theor. Phys. Suppl. 166 (2007), 143-151

Showing 1–10 of 10 results for author: Roeder, G