Showing 1–12 of 12 results for author: Dockhorn, T

  1. arXiv:2403.12015  [pdf, other]

    cs.CV

    Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation

    Authors: Axel Sauer, Frederic Boesel, Tim Dockhorn, Andreas Blattmann, Patrick Esser, Robin Rombach

    Abstract: Diffusion models are the main driver of progress in image and video synthesis, but suffer from slow inference speed. Distillation methods, like the recently introduced adversarial diffusion distillation (ADD) aim to shift the model from many-shot to single-step inference, albeit at the cost of expensive and difficult optimization due to its reliance on a fixed pretrained DINOv2 discriminator. We i…

    Submitted 18 March, 2024; originally announced March 2024.

  2. arXiv:2403.03206  [pdf, other]

    cs.CV

    Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

    Authors: Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim Entezari, Jonas Müller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, Dustin Podell, Tim Dockhorn, Zion English, Kyle Lacey, Alex Goodwin, Yannik Marek, Robin Rombach

    Abstract: Diffusion models create data from noise by inverting the forward paths of data towards noise and have emerged as a powerful generative modeling technique for high-dimensional, perceptual data such as images and videos. Rectified flow is a recent generative model formulation that connects data and noise in a straight line. Despite its better theoretical properties and conceptual simplicity, it is n…

    Submitted 5 March, 2024; originally announced March 2024.

  3. arXiv:2311.15127  [pdf, other]

    cs.CV

    Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets

    Authors: Andreas Blattmann, Tim Dockhorn, Sumith Kulal, Daniel Mendelevitch, Maciej Kilian, Dominik Lorenz, Yam Levi, Zion English, Vikram Voleti, Adam Letts, Varun Jampani, Robin Rombach

    Abstract: We present Stable Video Diffusion - a latent video diffusion model for high-resolution, state-of-the-art text-to-video and image-to-video generation. Recently, latent diffusion models trained for 2D image synthesis have been turned into generative video models by inserting temporal layers and finetuning them on small, high-quality video datasets. However, training methods in the literature vary wi…

    Submitted 25 November, 2023; originally announced November 2023.

  4. arXiv:2307.01952  [pdf, other]

    cs.CV cs.AI

    SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis

    Authors: Dustin Podell, Zion English, Kyle Lacey, Andreas Blattmann, Tim Dockhorn, Jonas Müller, Joe Penna, Robin Rombach

    Abstract: We present SDXL, a latent diffusion model for text-to-image synthesis. Compared to previous versions of Stable Diffusion, SDXL leverages a three times larger UNet backbone: The increase of model parameters is mainly due to more attention blocks and a larger cross-attention context as SDXL uses a second text encoder. We design multiple novel conditioning schemes and train SDXL on multiple aspect ra…

    Submitted 4 July, 2023; originally announced July 2023.

  5. arXiv:2304.08818  [pdf, other]

    cs.CV cs.LG

    Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models

    Authors: Andreas Blattmann, Robin Rombach, Huan Ling, Tim Dockhorn, Seung Wook Kim, Sanja Fidler, Karsten Kreis

    Abstract: Latent Diffusion Models (LDMs) enable high-quality image synthesis while avoiding excessive compute demands by training a diffusion model in a compressed lower-dimensional latent space. Here, we apply the LDM paradigm to high-resolution video generation, a particularly resource-intensive task. We first pre-train an LDM on images only; then, we turn the image generator into a video generator by int…

    Submitted 27 December, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

    Comments: Conference on Computer Vision and Pattern Recognition (CVPR) 2023. Project page: https://research.nvidia.com/labs/toronto-ai/VideoLDM/

  6. arXiv:2211.14169  [pdf, other]

    q-bio.QM stat.ML

    Latent Space Diffusion Models of Cryo-EM Structures

    Authors: Karsten Kreis, Tim Dockhorn, Zihao Li, Ellen Zhong

    Abstract: Cryo-electron microscopy (cryo-EM) is unique among tools in structural biology in its ability to image large, dynamic protein complexes. Key to this ability is image processing algorithms for heterogeneous cryo-EM reconstruction, including recent deep learning-based approaches. The state-of-the-art method cryoDRGN uses a Variational Autoencoder (VAE) framework to learn a continuous distribution of…

    Submitted 25 November, 2022; originally announced November 2022.

    Comments: Machine Learning for Structural Biology Workshop, NeurIPS 2022 (Oral)

  7. arXiv:2210.09929  [pdf, other]

    stat.ML cs.CR cs.LG

    Differentially Private Diffusion Models

    Authors: Tim Dockhorn, Tianshi Cao, Arash Vahdat, Karsten Kreis

    Abstract: While modern machine learning models rely on increasingly large training datasets, data is often limited in privacy-sensitive domains. Generative models trained with differential privacy (DP) on sensitive data can sidestep this challenge, providing access to synthetic data instead. We build on the recent success of diffusion models (DMs) and introduce Differentially Private Diffusion Models (DPDMs…

    Submitted 30 December, 2023; v1 submitted 18 October, 2022; originally announced October 2022.

    Comments: Accepted at TMLR (https://openreview.net/forum?id=ZPpQk7FJXF)

  8. arXiv:2210.05475  [pdf, other]

    stat.ML cs.LG

    GENIE: Higher-Order Denoising Diffusion Solvers

    Authors: Tim Dockhorn, Arash Vahdat, Karsten Kreis

    Abstract: Denoising diffusion models (DDMs) have emerged as a powerful class of generative models. A forward diffusion process slowly perturbs the data, while a deep model learns to gradually denoise. Synthesis amounts to solving a differential equation (DE) defined by the learnt model. Solving the DE requires slow iterative solvers for high-quality generation. In this work, we propose Higher-Order Denoisin…

    Submitted 11 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022

  9. arXiv:2112.07068  [pdf, other]

    stat.ML cs.LG

    Score-Based Generative Modeling with Critically-Damped Langevin Diffusion

    Authors: Tim Dockhorn, Arash Vahdat, Karsten Kreis

    Abstract: Score-based generative models (SGMs) have demonstrated remarkable synthesis quality. SGMs rely on a diffusion process that gradually perturbs the data towards a tractable distribution, while the generative model learns to denoise. The complexity of this denoising task is, apart from the data distribution itself, uniquely determined by the diffusion process. We argue that current SGMs employ overly…

    Submitted 25 March, 2022; v1 submitted 13 December, 2021; originally announced December 2021.

    Comments: ICLR 2022 (Spotlight)

  10. arXiv:2110.13220  [pdf, other]

    cs.LG stat.ML

    Demystifying and Generalizing BinaryConnect

    Authors: Tim Dockhorn, Yaoliang Yu, Eyyüb Sari, Mahdi Zolnouri, Vahid Partovi Nia

    Abstract: BinaryConnect (BC) and its many variations have become the de facto standard for neural network quantization. However, our understanding of the inner workings of BC is still quite limited. We attempt to close this gap in four different aspects: (a) we show that existing quantization algorithms, including post-training quantization, are surprisingly similar to each other; (b) we argue for proximal…

    Submitted 25 October, 2021; originally announced October 2021.

    Comments: NeurIPS 2021

  11. arXiv:2006.09396  [pdf, other]

    stat.ML cs.LG

    Density Deconvolution with Normalizing Flows

    Authors: Tim Dockhorn, James A. Ritchie, Yaoliang Yu, Iain Murray

    Abstract: Density deconvolution is the task of estimating a probability density function given only noise-corrupted samples. We can fit a Gaussian mixture model to the underlying density by maximum likelihood if the noise is normally distributed, but would like to exploit the superior density estimation performance of normalizing flows and allow for arbitrary noise distributions. Since both adjustments lead…

    Submitted 13 July, 2020; v1 submitted 16 June, 2020; originally announced June 2020.

    Comments: Appearing at the second workshop on Invertible Neural Networks, Normalizing Flows, and Explicit Likelihood Models (ICML 2020), Virtual Conference. 8 pages, 6 figures, 5 tables

  12. arXiv:1904.07200  [pdf, other]

    cs.LG math.NA stat.ML

    A Discussion on Solving Partial Differential Equations using Neural Networks

    Authors: Tim Dockhorn

    Abstract: Can neural networks learn to solve partial differential equations (PDEs)? We investigate this question for two (systems of) PDEs, namely, the Poisson equation and the steady Navier–Stokes equations. The contributions of this paper are five-fold. (1) Numerical experiments show that small neural networks (< 500 learnable parameters) are able to accurately learn complex solutions for systems of part…

    Submitted 15 April, 2019; originally announced April 2019.

    Comments: 9 pages, 2 figures