
Showing 1–43 of 43 results for author: Chen, R T Q

  1. arXiv:2406.04713  [pdf, other]

    cs.LG cond-mat.mtrl-sci cs.AI physics.comp-ph stat.ML

    FlowMM: Generating Materials with Riemannian Flow Matching

    Authors: Benjamin Kurt Miller, Ricky T. Q. Chen, Anuroop Sriram, Brandon M Wood

    Abstract: Crystalline materials are a fundamental component in next-generation technologies, yet modeling their distribution presents unique computational challenges. Of the plausible arrangements of atoms in a periodic lattice only a vanishingly small percentage are thermodynamically stable, which is a key indicator of the materials that can be experimentally realized. Two fundamental tasks in this area ar…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: https://github.com/facebookresearch/flowmm

    Journal ref: ICML 2024

  2. arXiv:2406.00288  [pdf, other]

    cs.LG stat.ML

    Neural Optimal Transport with Lagrangian Costs

    Authors: Aram-Alexandre Pooladian, Carles Domingo-Enrich, Ricky T. Q. Chen, Brandon Amos

    Abstract: We investigate the optimal transport problem between probability measures when the underlying cost function is understood to satisfy a least action principle, also known as a Lagrangian cost. These generalizations are useful when connecting observations from a physical system where the transport dynamics are influenced by the geometry of the system, such as obstacles (e.g., incorporating barrier f…

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: UAI 2024

  3. arXiv:2405.04795  [pdf, other]

    cs.LG

    Variational Schrödinger Diffusion Models

    Authors: Wei Deng, Weijian Luo, Yixin Tan, Marin Biloš, Yu Chen, Yuriy Nevmyvaka, Ricky T. Q. Chen

    Abstract: Schrödinger bridge (SB) has emerged as the go-to method for optimizing transportation plans in diffusion models. However, SB requires estimating the intractable forward score functions, inevitably resulting in the costly implicit training loss based on simulated trajectories. To improve the scalability while preserving efficient transportation plans, we leverage variational inference to linearize…

    Submitted 19 June, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  4. arXiv:2404.08764  [pdf, other]

    physics.chem-ph

    Leveraging Normalizing Flows for Orbital-Free Density Functional Theory

    Authors: Alexandre de Camargo, Ricky T. Q. Chen, Rodrigo A. Vargas-Hernández

    Abstract: Orbital-free density functional theory (OF-DFT) for real-space systems has historically depended on Lagrange optimization techniques, primarily due to the inability of previously proposed electron density ansatze to ensure the normalization constraint. This study illustrates how leveraging contemporary generative models, notably normalizing flows (NFs), can surmount this challenge. We pioneer a La…

    Submitted 18 April, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

    Comments: 6 pages, 4 Figures, (SI: 15 pages, 4 figures)

  5. arXiv:2403.01329  [pdf, other]

    cs.LG cs.AI cs.CV

    Bespoke Non-Stationary Solvers for Fast Sampling of Diffusion and Flow Models

    Authors: Neta Shaul, Uriel Singer, Ricky T. Q. Chen, Matthew Le, Ali Thabet, Albert Pumarola, Yaron Lipman

    Abstract: This paper introduces Bespoke Non-Stationary (BNS) Solvers, a solver distillation approach to improve sample efficiency of Diffusion and Flow models. BNS solvers are based on a family of non-stationary solvers that provably subsumes existing numerical ODE solvers and consequently demonstrate considerable improvement in sample approximation (PSNR) over these baselines. Compared to model distillatio…

    Submitted 2 March, 2024; originally announced March 2024.

  6. arXiv:2401.03228  [pdf, other]

    stat.ML cs.LG

    Reflected Schrödinger Bridge for Constrained Generative Modeling

    Authors: Wei Deng, Yu Chen, Nicole Tianjiao Yang, Hengrong Du, Qi Feng, Ricky T. Q. Chen

    Abstract: Diffusion models have become the go-to method for large-scale generative models in real-world applications. These applications often involve data distributions confined within bounded domains, typically requiring ad-hoc thresholding techniques for boundary enforcement. Reflected diffusion models (Lou23) aim to enhance generalizability by generating the data distribution through a backward process…

    Submitted 6 January, 2024; originally announced January 2024.

  7. arXiv:2312.05250  [pdf, other]

    cs.LG cs.AI math.OC stat.ML

    TaskMet: Task-Driven Metric Learning for Model Learning

    Authors: Dishank Bansal, Ricky T. Q. Chen, Mustafa Mukadam, Brandon Amos

    Abstract: Deep learning models are often deployed in downstream tasks that the training procedure may not be aware of. For example, models solely trained to achieve accurate predictions may struggle to perform well on downstream tasks because seemingly small prediction errors may incur drastic task errors. The standard end-to-end learning approach is to make the task loss differentiable or to introduce a di…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023

  8. arXiv:2312.02027  [pdf, other]

    math.OC cs.LG math.NA math.PR stat.ML

    Stochastic Optimal Control Matching

    Authors: Carles Domingo-Enrich, Jiequn Han, Brandon Amos, Joan Bruna, Ricky T. Q. Chen

    Abstract: Stochastic optimal control, which has the goal of driving the behavior of noisy systems, is broadly applicable in science, engineering and artificial intelligence. Our work introduces Stochastic Optimal Control Matching (SOCM), a novel Iterative Diffusion Optimization (IDO) technique for stochastic optimal control that stems from the same philosophy as the conditional score matching loss for diffu…

    Submitted 28 June, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

  9. arXiv:2311.13518  [pdf, other]

    physics.chem-ph quant-ph

    Orbital-Free Density Functional Theory with Continuous Normalizing Flows

    Authors: Alexandre de Camargo, Ricky T. Q. Chen, Rodrigo A. Vargas-Hernández

    Abstract: Orbital-free density functional theory (OF-DFT) provides an alternative approach for calculating the molecular electronic energy, relying solely on the electron density. In OF-DFT, the ground-state density is optimized variationally to minimize the total energy functional while satisfying the normalization constraint. In this work, we introduce a novel approach by parameterizing the electroni…

    Submitted 22 November, 2023; originally announced November 2023.

    Comments: 6 pages, 3 figures

  10. arXiv:2311.13443  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO stat.ML

    Guided Flows for Generative Modeling and Decision Making

    Authors: Qinqing Zheng, Matt Le, Neta Shaul, Yaron Lipman, Aditya Grover, Ricky T. Q. Chen

    Abstract: Classifier-free guidance is a key component for enhancing the performance of conditional generative models across diverse tasks. While it has previously demonstrated remarkable improvements for the sample quality, it has only been exclusively employed for diffusion models. In this paper, we integrate classifier-free guidance into Flow Matching (FM) models, an alternative simulation-free approach t…

    Submitted 7 December, 2023; v1 submitted 22 November, 2023; originally announced November 2023.

  11. arXiv:2310.19075  [pdf, other]

    cs.LG cs.AI cs.CV

    Bespoke Solvers for Generative Flow Models

    Authors: Neta Shaul, Juan Perez, Ricky T. Q. Chen, Ali Thabet, Albert Pumarola, Yaron Lipman

    Abstract: Diffusion or flow-based models are powerful generative paradigms that are notoriously hard to sample as samples are defined as solutions to high-dimensional Ordinary or Stochastic Differential Equations (ODEs/SDEs) which require a large Number of Function Evaluations (NFE) to approximate well. Existing methods to alleviate the costly sampling process include model distillation and designing dedica…

    Submitted 29 October, 2023; originally announced October 2023.

  12. arXiv:2310.04432  [pdf, other]

    cs.CV cs.AI cs.LG

    Training-free Linear Image Inverses via Flows

    Authors: Ashwini Pokle, Matthew J. Muckley, Ricky T. Q. Chen, Brian Karrer

    Abstract: Solving inverse problems without any training involves using a pretrained generative model and making appropriate modifications to the generation process to avoid finetuning of the generative model. While recent methods have explored the use of diffusion models, they still require the manual tuning of many hyperparameters for different inverse problems. In this work, we propose a training-free met…

    Submitted 10 March, 2024; v1 submitted 25 September, 2023; originally announced October 2023.

    Comments: 40 pages, 30 figures. Added additional qualitative results in the appendix

  13. arXiv:2310.02679  [pdf, other]

    cs.LG cs.AI stat.CO stat.ME stat.ML

    Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization

    Authors: Dinghuai Zhang, Ricky T. Q. Chen, Cheng-Hao Liu, Aaron Courville, Yoshua Bengio

    Abstract: We tackle the problem of sampling from intractable high-dimensional density functions, a fundamental task that often appears in machine learning and statistics. We extend recent sampling-based approaches that leverage controlled stochastic processes to model approximate samples from these target densities. The main drawback of these approaches is that the training objective requires full trajector…

    Submitted 9 March, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: Accepted by ICLR 2024

  14. arXiv:2310.02233  [pdf, other]

    stat.ML cs.LG math.OC

    Generalized Schrödinger Bridge Matching

    Authors: Guan-Horng Liu, Yaron Lipman, Maximilian Nickel, Brian Karrer, Evangelos A. Theodorou, Ricky T. Q. Chen

    Abstract: Modern distribution matching algorithms for training diffusion or flow models directly prescribe the time evolution of the marginal distributions between two boundary distributions. In this work, we consider a generalized distribution matching setup, where these marginals are only implicitly described as a solution to some task-specific objective function. The problem setup, known as the Generaliz…

    Submitted 18 April, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: ICLR 2024 Camera Ready

  15. arXiv:2306.06626  [pdf, other]

    cs.LG stat.ML

    On Kinetic Optimal Probability Paths for Generative Models

    Authors: Neta Shaul, Ricky T. Q. Chen, Maximilian Nickel, Matt Le, Yaron Lipman

    Abstract: Recent successful generative models are trained by fitting a neural network to an a-priori defined tractable probability density path taking noise to training examples. In this paper we investigate the space of Gaussian probability paths, which includes diffusion paths as an instance, and look for an optimal member in some useful sense. In particular, minimizing the Kinetic Energy (KE) of a path i…

    Submitted 11 June, 2023; originally announced June 2023.

  16. arXiv:2304.14772  [pdf, other]

    cs.LG

    Multisample Flow Matching: Straightening Flows with Minibatch Couplings

    Authors: Aram-Alexandre Pooladian, Heli Ben-Hamu, Carles Domingo-Enrich, Brandon Amos, Yaron Lipman, Ricky T. Q. Chen

    Abstract: Simulation-free methods for training continuous-time generative models construct probability paths that go between noise distributions and individual data samples. Recent works, such as Flow Matching, derived paths that are optimal for each data sample. However, these algorithms rely on independent data and noise samples, and do not exploit underlying structure in the data distribution for constru…

    Submitted 24 May, 2023; v1 submitted 28 April, 2023; originally announced April 2023.

  17. arXiv:2302.05793  [pdf, other]

    cs.LG cs.AI stat.CO stat.ML

    Distributional GFlowNets with Quantile Flows

    Authors: Dinghuai Zhang, Ling Pan, Ricky T. Q. Chen, Aaron Courville, Yoshua Bengio

    Abstract: Generative Flow Networks (GFlowNets) are a new family of probabilistic samplers where an agent learns a stochastic policy for generating complex combinatorial structure through a series of decision-making steps. Despite being inspired from reinforcement learning, the current GFlowNet framework is relatively limited in its applicability and cannot handle stochasticity in the reward function. In thi…

    Submitted 17 February, 2024; v1 submitted 11 February, 2023; originally announced February 2023.

    Comments: Accepted by TMLR

  18. arXiv:2302.03660  [pdf, other]

    cs.LG cs.AI stat.ML

    Flow Matching on General Geometries

    Authors: Ricky T. Q. Chen, Yaron Lipman

    Abstract: We propose Riemannian Flow Matching (RFM), a simple yet powerful framework for training continuous normalizing flows on manifolds. Existing methods for generative modeling on manifolds either require expensive simulation, are inherently unable to scale to high dimensions, or use approximations for limiting quantities that result in biased training objectives. Riemannian Flow Matching bypasses thes…

    Submitted 26 February, 2024; v1 submitted 7 February, 2023; originally announced February 2023.

    Journal ref: ICLR 2024

  19. arXiv:2212.13659  [pdf, other]

    cs.LG stat.ML

    Latent Discretization for Continuous-time Sequence Compression

    Authors: Ricky T. Q. Chen, Matthew Le, Matthew Muckley, Maximilian Nickel, Karen Ullrich

    Abstract: Neural compression offers a domain-agnostic approach to creating codecs for lossy or lossless compression via deep generative models. For sequence compression, however, most deep sequence models have costs that scale with the sequence length rather than the sequence complexity. In this work, we instead treat data sequences as observations from an underlying continuous-time process and learn how to…

    Submitted 27 December, 2022; originally announced December 2022.

  20. arXiv:2210.02747  [pdf, other]

    cs.LG cs.AI stat.ML

    Flow Matching for Generative Modeling

    Authors: Yaron Lipman, Ricky T. Q. Chen, Heli Ben-Hamu, Maximilian Nickel, Matt Le

    Abstract: We introduce a new paradigm for generative modeling built on Continuous Normalizing Flows (CNFs), allowing us to train CNFs at unprecedented scale. Specifically, we present the notion of Flow Matching (FM), a simulation-free approach for training CNFs based on regressing vector fields of fixed conditional probability paths. Flow Matching is compatible with a general family of Gaussian probability…

    Submitted 8 February, 2023; v1 submitted 6 October, 2022; originally announced October 2022.

  21. arXiv:2210.01741  [pdf, other]

    cs.LG

    Neural Conservation Laws: A Divergence-Free Perspective

    Authors: Jack Richter-Powell, Yaron Lipman, Ricky T. Q. Chen

    Abstract: We investigate the parameterization of deep neural networks that by design satisfy the continuity equation, a fundamental conservation law. This is enabled by the observation that any solution of the continuity equation can be represented as a divergence-free vector field. We hence propose building divergence-free neural networks through the concept of differential forms, and with the aid of autom…

    Submitted 11 December, 2022; v1 submitted 4 October, 2022; originally announced October 2022.

    Journal ref: NeurIPS 2022

  22. arXiv:2210.00999  [pdf, other]

    cs.LG cs.AI stat.ML

    Latent State Marginalization as a Low-cost Approach for Improving Exploration

    Authors: Dinghuai Zhang, Aaron Courville, Yoshua Bengio, Qinqing Zheng, Amy Zhang, Ricky T. Q. Chen

    Abstract: While the maximum entropy (MaxEnt) reinforcement learning (RL) framework -- often touted for its exploration and robustness capabilities -- is usually motivated from a probabilistic perspective, the use of deep probabilistic models has not gained much traction in practice due to their inherent complexity. In this work, we propose the adoption of latent variable policies within the MaxEnt framework…

    Submitted 10 February, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: Accepted by ICLR 2023

  23. arXiv:2209.02606  [pdf, other]

    cs.LG cs.AI stat.ML

    Unifying Generative Models with GFlowNets and Beyond

    Authors: Dinghuai Zhang, Ricky T. Q. Chen, Nikolay Malkin, Yoshua Bengio

    Abstract: There are many frameworks for deep generative modeling, each often presented with their own specific training algorithms and inference methods. Here, we demonstrate the connections between existing deep generative models and the recently introduced GFlowNet framework, a probabilistic inference machine which treats sampling as a decision-making process. This analysis sheds light on their overlappin…

    Submitted 30 January, 2023; v1 submitted 6 September, 2022; originally announced September 2022.

    Comments: expanded version of the ICML 2022 workshop paper

  24. arXiv:2207.09442  [pdf, other]

    cs.RO cs.CV cs.LG math.OC

    Theseus: A Library for Differentiable Nonlinear Optimization

    Authors: Luis Pineda, Taosha Fan, Maurizio Monge, Shobha Venkataraman, Paloma Sodhi, Ricky T. Q. Chen, Joseph Ortiz, Daniel DeTone, Austin Wang, Stuart Anderson, Jing Dong, Brandon Amos, Mustafa Mukadam

    Abstract: We present Theseus, an efficient application-agnostic open source library for differentiable nonlinear least squares (DNLS) optimization built on PyTorch, providing a common framework for end-to-end structured learning in robotics and vision. Existing DNLS implementations are application specific and do not always incorporate many ingredients important for efficiency. Theseus is application-agnost…

    Submitted 18 January, 2023; v1 submitted 19 July, 2022; originally announced July 2022.

    Comments: Advances in Neural Information Processing Systems (NeurIPS), 2022

  25. arXiv:2207.04711  [pdf, other]

    stat.ML cs.LG

    Matching Normalizing Flows and Probability Paths on Manifolds

    Authors: Heli Ben-Hamu, Samuel Cohen, Joey Bose, Brandon Amos, Aditya Grover, Maximilian Nickel, Ricky T. Q. Chen, Yaron Lipman

    Abstract: Continuous Normalizing Flows (CNFs) are a class of generative models that transform a prior distribution to a model distribution by solving an ordinary differential equation (ODE). We propose to train CNFs on manifolds by minimizing probability path divergence (PPD), a novel family of divergences between the probability density path generated by the CNF and a target probability density path. PPD i…

    Submitted 11 July, 2022; originally announced July 2022.

    Comments: ICML 2022

  26. arXiv:2203.06832  [pdf, other]

    cs.LG stat.ML

    Semi-Discrete Normalizing Flows through Differentiable Tessellation

    Authors: Ricky T. Q. Chen, Brandon Amos, Maximilian Nickel

    Abstract: Mapping between discrete and continuous distributions is a difficult task and many have had to resort to heuristical approaches. We propose a tessellation-based approach that directly learns quantization boundaries in a continuous space, complete with exact likelihood evaluations. This is done through constructing normalizing flows on convex polytopes parameterized using a simple homeomorphism wit…

    Submitted 11 December, 2022; v1 submitted 13 March, 2022; originally announced March 2022.

    Journal ref: NeurIPS 2022

  27. arXiv:2103.12604  [pdf, other]

    quant-ph physics.chem-ph physics.comp-ph

    Fully differentiable optimization protocols for non-equilibrium steady states

    Authors: Rodrigo A. Vargas-Hernández, Ricky T. Q. Chen, Kenneth A. Jung, Paul Brumer

    Abstract: In the case of quantum systems interacting with multiple environments, the time-evolution of the reduced density matrix is described by the Liouvillian. For a variety of physical observables, the long-time limit or steady state solution is needed for the computation of desired physical observables. For inverse design or optimal control of such systems, the common approaches are based on brute-forc…

    Submitted 23 November, 2021; v1 submitted 23 March, 2021; originally announced March 2021.

    Comments: Main work 10 pages and 5 Figures. Supplemental Material 12 pages, 1 Figure, 3 Tables

  28. arXiv:2102.06559  [pdf, other]

    stat.ML cs.LG

    Infinitely Deep Bayesian Neural Networks with Stochastic Differential Equations

    Authors: Winnie Xu, Ricky T. Q. Chen, Xuechen Li, David Duvenaud

    Abstract: We perform scalable approximate inference in continuous-depth Bayesian neural networks. In this model class, uncertainty about separate weights in each layer gives hidden units that follow a stochastic differential equation. We demonstrate gradient-based stochastic variational inference in this infinite-parameter setting, producing arbitrarily-flexible approximate posteriors. We also derive a nove…

    Submitted 30 January, 2022; v1 submitted 12 February, 2021; originally announced February 2021.

  29. arXiv:2012.05942  [pdf, other]

    cs.LG math.OC

    Convex Potential Flows: Universal Probability Distributions with Optimal Transport and Convex Optimization

    Authors: Chin-Wei Huang, Ricky T. Q. Chen, Christos Tsirigotis, Aaron Courville

    Abstract: Flow-based models are powerful tools for designing probabilistic models with tractable density. This paper introduces Convex Potential Flows (CP-Flow), a natural and efficient parameterization of invertible models inspired by the optimal transport (OT) theory. CP-Flows are the gradient map of a strongly convex neural potential function. The convexity implies invertibility and allows us to resort t…

    Submitted 23 February, 2021; v1 submitted 10 December, 2020; originally announced December 2020.

  30. arXiv:2011.12808  [pdf, other]

    quant-ph physics.chem-ph physics.comp-ph

    Inverse design of dissipative quantum steady-states with implicit differentiation

    Authors: Rodrigo A. Vargas-Hernández, Ricky T. Q. Chen, Kenneth A. Jung, Paul Brumer

    Abstract: Inverse design of a property that depends on the steady-state of an open quantum system is commonly done by grid-search type of methods. In this paper we present a new methodology that allows us to compute the gradient of the steady-state of an open quantum system with respect to any parameter of the Hamiltonian using the implicit differentiation theorem. As an example, we present a simulation of…

    Submitted 25 November, 2020; originally announced November 2020.

    Comments: 6 pages, 2 figures, accepted for publication in Third Workshop on Machine Learning and the Physical Sciences (NeurIPS 2020), Vancouver, Canada

  31. arXiv:2011.04803  [pdf, other]

    cs.LG

    Self-Tuning Stochastic Optimization with Curvature-Aware Gradient Filtering

    Authors: Ricky T. Q. Chen, Dami Choi, Lukas Balles, David Duvenaud, Philipp Hennig

    Abstract: Standard first-order stochastic optimization algorithms base their updates solely on the average mini-batch gradient, and it has been shown that tracking additional quantities such as the curvature can help de-sensitize common hyperparameters. Based on this intuition, we explore the use of exact per-sample Hessian-vector products and gradients to construct optimizers that are self-tuning and hyper…

    Submitted 9 November, 2020; originally announced November 2020.

  32. arXiv:2011.04583  [pdf, other]

    cs.LG

    Neural Spatio-Temporal Point Processes

    Authors: Ricky T. Q. Chen, Brandon Amos, Maximilian Nickel

    Abstract: We propose a new class of parameterizations for spatio-temporal point processes which leverage Neural ODEs as a computational method and enable flexible, high-fidelity models of discrete events that are localized in continuous time and space. Central to our approach is a combination of continuous-time neural networks with two novel neural architectures, i.e., Jump and Attentive Continuous-time Nor…

    Submitted 17 March, 2021; v1 submitted 9 November, 2020; originally announced November 2020.

    Journal ref: ICLR 2021

  33. arXiv:2011.03902  [pdf, other]

    cs.LG stat.ML

    Learning Neural Event Functions for Ordinary Differential Equations

    Authors: Ricky T. Q. Chen, Brandon Amos, Maximilian Nickel

    Abstract: The existing Neural ODE formulation relies on an explicit knowledge of the termination time. We extend Neural ODEs to implicitly defined termination criteria modeled by neural event functions, which can be chained together and differentiated through. Neural Event ODEs are capable of modeling discrete and instantaneous changes in a continuous-time system, without prior knowledge of when these chang…

    Submitted 27 October, 2021; v1 submitted 7 November, 2020; originally announced November 2020.

    Journal ref: ICLR 2021

  34. arXiv:2009.09457  [pdf, other]

    cs.LG math.CA

    "Hey, that's not an ODE": Faster ODE Adjoints via Seminorms

    Authors: Patrick Kidger, Ricky T. Q. Chen, Terry Lyons

    Abstract: Neural differential equations may be trained by backpropagating gradients via the adjoint method, which is another differential equation typically solved using an adaptive-step-size numerical differential equation solver. A proposed step is accepted if its error, relative to some norm, is sufficiently small; else it is rejected, the step is shrunk, and the process is repeated. Here, we demo…

    Submitted 10 May, 2021; v1 submitted 20 September, 2020; originally announced September 2020.

    Comments: Published at ICML 2021

  35. arXiv:2004.00353  [pdf, other]

    cs.LG stat.ML

    SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models

    Authors: Yucen Luo, Alex Beatson, Mohammad Norouzi, Jun Zhu, David Duvenaud, Ryan P. Adams, Ricky T. Q. Chen

    Abstract: Standard variational lower bounds used to train latent variable models produce biased estimates of most quantities of interest. We introduce an unbiased estimator of the log marginal likelihood and its gradients for latent variable models based on randomized truncation of infinite series. If parameterized by an encoder-decoder architecture, the parameters of the encoder can be optimized to minimiz…

    Submitted 10 July, 2020; v1 submitted 1 April, 2020; originally announced April 2020.

    Comments: ICLR 2020

  36. arXiv:2001.01328  [pdf, other]

    cs.LG math.NA stat.ML

    Scalable Gradients for Stochastic Differential Equations

    Authors: Xuechen Li, Ting-Kam Leonard Wong, Ricky T. Q. Chen, David Duvenaud

    Abstract: The adjoint sensitivity method scalably computes gradients of solutions to ordinary differential equations. We generalize this method to stochastic differential equations, allowing time-efficient and constant-memory computation of gradients with high-order adaptive solvers. Specifically, we derive a stochastic differential equation whose solution is the gradient, a memory-efficient algorithm for c…

    Submitted 18 October, 2020; v1 submitted 5 January, 2020; originally announced January 2020.

    Comments: AISTATS 2020; 25 pages, 6 figures in main text; clarify notation in appendix

  37. arXiv:1912.03579  [pdf, other]

    cs.LG stat.ML

    Neural Networks with Cheap Differential Operators

    Authors: Ricky T. Q. Chen, David Duvenaud

    Abstract: Gradients of neural networks can be computed efficiently for any architecture, but some applications require differential operators with higher time complexity. We describe a family of restricted neural network architectures that allow efficient computation of a family of differential operators involving dimension-wise derivatives, used in cases such as computing the divergence. Our proposed archi…

    Submitted 7 December, 2019; originally announced December 2019.

    Comments: NeurIPS 2019

  38. arXiv:1907.03907  [pdf, other]

    cs.LG stat.ML

    Latent ODEs for Irregularly-Sampled Time Series

    Authors: Yulia Rubanova, Ricky T. Q. Chen, David Duvenaud

    Abstract: Time series with non-uniform intervals occur in many applications, and are difficult to model using standard recurrent neural networks (RNNs). We generalize RNNs to have continuous-time hidden dynamics defined by ordinary differential equations (ODEs), a model we call ODE-RNNs. Furthermore, we use ODE-RNNs to replace the recognition network of the recently-proposed Latent ODE model. Both ODE-RNNs…

    Submitted 8 July, 2019; originally announced July 2019.

  39. arXiv:1906.02735  [pdf, other]

    stat.ML cs.LG

    Residual Flows for Invertible Generative Modeling

    Authors: Ricky T. Q. Chen, Jens Behrmann, David Duvenaud, Jörn-Henrik Jacobsen

    Abstract: Flow-based generative models parameterize probability distributions through an invertible transformation and can be trained by maximum likelihood. Invertible residual networks provide a flexible family of transformations where only Lipschitz conditions rather than strict architectural constraints are needed for enforcing invertibility. However, prior work trained invertible residual networks for d…

    Submitted 23 July, 2020; v1 submitted 6 June, 2019; originally announced June 2019.

    Comments: NeurIPS 2019

  40. arXiv:1811.00995  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Invertible Residual Networks

    Authors: Jens Behrmann, Will Grathwohl, Ricky T. Q. Chen, David Duvenaud, Jörn-Henrik Jacobsen

    Abstract: We show that standard ResNet architectures can be made invertible, allowing the same model to be used for classification, density estimation, and generation. Typically, enforcing invertibility requires partitioning dimensions or restricting network architectures. In contrast, our approach only requires adding a simple normalization step during training, already available in standard frameworks. In…

    Submitted 18 May, 2019; v1 submitted 2 November, 2018; originally announced November 2018.

    Journal ref: Proceedings of the International Conference on Machine Learning (ICML), 2019

  41. arXiv:1810.01367  [pdf, other]

    cs.LG cs.CV stat.ML

    FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models

    Authors: Will Grathwohl, Ricky T. Q. Chen, Jesse Bettencourt, Ilya Sutskever, David Duvenaud

    Abstract: A promising class of generative models maps points from a simple distribution to a complex distribution through an invertible neural network. Likelihood-based training of these models requires restricting their architectures to allow cheap computation of Jacobian determinants. Alternatively, the Jacobian trace can be used if the transformation is specified by an ordinary differential equation. In…

    Submitted 22 October, 2018; v1 submitted 2 October, 2018; originally announced October 2018.

    Comments: 8 Pages, 6 figures

  42. arXiv:1806.07366  [pdf, other]

    cs.LG cs.AI stat.ML

    Neural Ordinary Differential Equations

    Authors: Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, David Duvenaud

    Abstract: We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a black-box differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly…

    Submitted 13 December, 2019; v1 submitted 19 June, 2018; originally announced June 2018.

  43. arXiv:1802.04942  [pdf, other]

    cs.LG cs.AI stat.ML

    Isolating Sources of Disentanglement in Variational Autoencoders

    Authors: Ricky T. Q. Chen, Xuechen Li, Roger Grosse, David Duvenaud

    Abstract: We decompose the evidence lower bound to show the existence of a term measuring the total correlation between latent variables. We use this to motivate our $β$-TCVAE (Total Correlation Variational Autoencoder), a refinement of the state-of-the-art $β$-VAE objective for learning disentangled representations, requiring no additional hyperparameters during training. We further propose a principled cl…

    Submitted 23 April, 2019; v1 submitted 13 February, 2018; originally announced February 2018.